Variants and exons of the GlyT1 transporter

ABSTRACT

The present invention provides polypeptide and polynucleotide sequences for novel splice variants of the sodium and chloride-dependent glycine transporter type 1 (GlyT1). These polypeptides and polynucleotides are useful in the treatment and diagnosis of disorders such as neurological and psychiatric disorders including schizophrenia. The invention also provides antibodies directed specifically against these novel polypeptides, and kits comprising the herein-described polynucleotides, polypeptides, and/or antibodies.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 10/484,690, filed Sep. 27, 2004, which is the national stage of international application No. PCT/IB02/03386, filed Jul. 22, 2002, which claims the benefit of U.S. Provisional Application Ser. No. 60/307,685, filed Jul. 24, 2001, the disclosures of which are hereby incorporated by reference in their entireties, including all figures, tables and amino acid or nucleic acid sequences.

The Sequence Listing for this application is labeled “seq-list.txt”, which was created on Feb. 15, 2008 and is 166 KB. The entire contents of the sequence listing are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

The present invention is directed to polynucleotides encoding novel exons and novel splice variants of the sodium and chloride-dependent glycine transporter type 1 (GlyT1), and their use in the treatment and diagnosis of neurological and psychiatric disorders such as schizophrenia. The invention also deals with antibodies directed specifically against these novel polypeptides which are useful, e.g., as diagnostic reagents.

BACKGROUND OF THE INVENTION

Neurotransmitter transporters play a critical role in the regulation of synaptic transmission. These transporters, which are located on the pre-synaptic terminal and surrounding glial cells, sequester neurotransmitter from the synapse, thereby regulating the synaptic concentration of neurotransmitter and influencing the duration and magnitude of synaptic transmission. Transporters also help to limit the extent of synaptic transmission by preventing the spread of transmitter to neighboring synapses. In view of the important role played by these transporters in neurological function, they represent attractive targets for pharmacological modulation, potentially providing novel methods of treatment for any of a number of psychological and neurological conditions.

The amino acid glycine functions at both inhibitory and excitatory synapses in the central and peripheral nervous systems of mammals. The excitatory and inhibitory functions of glycine are mediated by two different types of receptor, each of which is associated with a different type of glycine transporter. At excitatory synapses, glycine acts as an obligatory co-agonist at a class of glutamate receptors called N-methyl-D-aspartate (NMDA) receptors. Activation of these receptors in neurons increases sodium and calcium conductance, thereby depolarizing the neuron and increasing the likelihood that the neuron will fire an action potential.

The class of glycine transporter thought to be involved in excitatory synapses in conjunction with NMDA receptors is GlyT-1. At least four variants of GlyT-1 (GlyT-1a, GlyT-1b, GlyT-1c, and GlyT-1d) have been described. Both GlyT1 and GlyT2 transporters are members of a broader family of sodium- and chloride-dependent neurotransmitter transporters, the members of which typically have 12 transmembrane domains (Olivares et al. (1997) J. Biol. Chem. 272:1211-1217; Uhl, Trends in Neuroscience 15: 265-268, 1992; Clark et al, BioEssays 15: 323-332, 1993). Both the N- and C-termini of the members of this family are thought to be intracellular.

NMDA receptor activity has been implicated in a large number of psychological and neurological functions, such as learning and memory, and in a large number of diseases and conditions, including schizophrenia, dementias, attention-deficit hyperactivity disorder, and various neurodegenerative disorders. Thus, modulators of GlyT1 proteins can be used to treat these and other conditions. The present invention addresses these and other needs.

SUMMARY OF THE INVENTION

The present invention pertains to polynucleotides and polypeptides corresponding to cDNA sequences encoding 8 novel splice variants of the GlyT1 glycine transporter. Oligonucleotide probes or primers hybridizing specifically with the novel cDNA sequences are also part of the present invention, as are DNA amplification and detection methods using said primers and probes.

A further object of the invention consists of recombinant vectors comprising any of the nucleic acid sequences described herein, as well as of cell hosts and transgenic non-human animals comprising these nucleic acid sequences or recombinant vectors.

The invention is also directed to methods for the screening of substances or molecules that interact with any of the present polypeptides or that modulate the activity of any of the present polypeptides.

As such, in one aspect, the present invention provides an isolated, purified, or recombinant polynucleotide comprising a nucleic acid sequence that is at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to a contiguous span of at least 12, 25, 50, 100, 250, 500, 1000, or more nucleotides of any of the nucleic acid sequences shown as SEQ ID NOs:2-9 or 14-21, or a sequence complementary to any of these sequences. In another aspect, the present invention provides an isolated, purified, or recombinant polynucleotide comprising a nucleic acid sequence that encodes a functional GlyT1 transporter and which specifically hybridizes under stringent or moderate conditions with any of the nucleic acid sequences shown as SEQ ID NOs:2-9 or 14-21.
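As an illustration only, the notion of percent identity over a contiguous span can be sketched in a few lines of Python. The sketch assumes a simple ungapped, position-by-position comparison, which is a simplification and not necessarily the comparison method contemplated for the invention; the function names and the example sequences are hypothetical.

```python
def percent_identity(span_a, span_b):
    """Ungapped percent identity between two spans of equal length."""
    if not span_a or len(span_a) != len(span_b):
        raise ValueError("spans must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(span_a.upper(), span_b.upper()) if x == y)
    return 100.0 * matches / len(span_a)


def best_span_identity(query, reference, span_length):
    """Highest ungapped identity between any span of span_length
    nucleotides in query and any span of the same length in reference."""
    best = 0.0
    for i in range(len(query) - span_length + 1):
        query_span = query[i:i + span_length]
        for j in range(len(reference) - span_length + 1):
            candidate = percent_identity(query_span, reference[j:j + span_length])
            best = max(best, candidate)
    return best
```

Under these assumptions, a query would meet a threshold such as "at least about 90% identical to a contiguous span of at least 12 nucleotides" when best_span_identity(query, reference, 12) returns a value of 90 or more.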

In another aspect, the present invention provides an isolated, purified, or recombinant polynucleotide comprising a nucleic acid sequence that is at least about 70%, 75%, 80%, 85%, 90%, 95% or more identical to any of the sequences shown as SEQ ID NOs:14-21, wherein the polynucleotide comprises a sequence at least about 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any of the sequences shown as SEQ ID NOs:2-9.

In another aspect, the present invention provides an isolated, purified, or recombinant polynucleotide encoding a glycine transporter, wherein said polynucleotide hybridizes under stringent or moderate hybridization conditions with a nucleic acid comprising any of the sequences shown as SEQ ID NOs:14-21, and wherein said polynucleotide comprises any of the sequences shown as SEQ ID NOs:2-9.

In another aspect, the present invention provides an isolated, purified, or recombinant polynucleotide which encodes a polypeptide comprising an amino acid sequence that is at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more identical to a contiguous span of at least 6, 12, 25, 50, 100, 200, 300, 400, 500, or more amino acids of any of SEQ ID NOs:26-33. In one embodiment, the polypeptide comprises any of the amino acid sequences shown as SEQ ID NOs:26-33.

In another aspect, the present invention provides a method of producing a GlyT1 polypeptide, said method comprising the following steps: a) providing a host cell comprising a nucleic acid encoding any one of the polypeptides shown as SEQ ID NOs:26-33, operably linked to a promoter; b) cultivating said host cell under conditions conducive to the expression of said polypeptide; and c) isolating said polypeptide from said host cell.

In one embodiment, the polynucleotide is attached to a solid support. In another embodiment, the polynucleotide further comprises a label. In another embodiment, the polynucleotide is operably linked to a promoter.

In another aspect, the present invention provides a biologically active fragment of any of the herein-described polynucleotides.

In another aspect, the present invention provides an array of polynucleotides comprising at least one of the herein-described polynucleotides. In one embodiment, the array is addressable.

In another aspect, the present invention provides a recombinant vector comprising any of the herein-described polynucleotides.

In another aspect, the present invention provides a host cell comprising any of the herein-described recombinant vectors or polynucleotides.

In another aspect, the present invention provides a non-human host animal or mammal comprising any of the herein-described recombinant vectors or polynucleotides.

In another aspect, the present invention provides an isolated, purified, or recombinant polypeptide comprising an amino acid sequence that is at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more identical to a contiguous span of at least 6, 12, 25, 50, 100, 250, 500, or more amino acids of any of the sequences shown as SEQ ID NOs:26-33.

In one embodiment, the polypeptide comprises any of the sequences shown as SEQ ID NOs:26-33.

In another aspect, the present invention provides an isolated, purified, or recombinant polypeptide, wherein the polypeptide comprises an amino acid sequence encoded by any of the nucleic acid sequences shown as SEQ ID NOs:2-9 or 14-21.

In another aspect, the present invention provides a biologically active fragment of any of the herein-described polypeptides.

The invention further relates to methods of making the polypeptides of the present invention.

In another aspect, the present invention provides an isolated or purified antibody capable of selectively binding to an epitope-containing fragment of any of the herein-described polypeptides, such as the polypeptides encoded by any of the sequences shown as SEQ ID NOs:2-9 or 14-21, or the polypeptides comprising any of the amino acid sequences shown as SEQ ID NOs:26-33.

In another aspect, the present invention provides a method of binding an anti-GlyT1 antibody to any of the herein-described polypeptides, e.g. polypeptides encoded by any of SEQ ID NOs:2-9 or 14-21, or comprising any of the sequences shown as SEQ ID NOs:26-33, said method comprising contacting said antibody with said polypeptide under conditions in which said antibody can specifically bind to said polypeptide.

The present invention further relates to transgenic plants or animals, wherein said transgenic plant or animal is transgenic for a polynucleotide of the present invention and expresses a polypeptide of the present invention, or in which a polynucleotide of the present invention has been specifically disrupted or replaced with an inactive version of the polynucleotide, or with a substitute version having altered properties.

In another aspect, the present invention provides a diagnostic kit comprising any of the herein-described polynucleotides, polypeptides, or antibodies.

The invention also provides kits, uses and methods for detecting the expression and/or biological activity of any of the herein-described GlyT1 variants, e.g., in a biological sample. One such method involves assaying for expression using the polymerase chain reaction (PCR), e.g., RT-PCR, to detect mRNA encoding any of the variants. In another method, Northern blot hybridization is used. Alternatively, a method of detecting gene expression in a test sample can be accomplished using a compound which binds to any of the herein-described polypeptides, e.g. a GlyT1-specific antibody, preferably a variant-specific anti-GlyT1 antibody.

In another aspect, the present invention provides a method of detecting the expression of a GlyT1 gene within a cell, said method comprising the steps of: a) contacting said cell or an extract from said cell with either of: i) a polynucleotide that hybridizes under stringent conditions to any of the herein-described GlyT1 polynucleotides; or ii) a compound that specifically binds to any of the herein-described GlyT1 polypeptides; and b) detecting the presence or absence of hybridization between said polynucleotide and an RNA species within said cell or extract, or the presence or absence of binding of said compound to a protein within said cell or extract; wherein a detection of the presence of said hybridization or of said binding indicates that said GlyT1 gene is expressed within said cell.

In one embodiment, said polynucleotide is an oligonucleotide primer, and said hybridization is detected by detecting the presence of an amplification product comprising the sequence of said primer. In another embodiment, said compound is an anti-GlyT1 antibody.

In another aspect, the present invention provides a method of identifying a candidate modulator of a GlyT1 polypeptide, said method comprising: a) contacting any of the herein-described GlyT1 polypeptides with a test compound; and b) determining whether said compound specifically binds to said polypeptide; wherein a detection that said compound specifically binds to said polypeptide indicates that said compound is a candidate modulator of said GlyT1 polypeptide.

In one embodiment, the method further comprises testing the activity of said GlyT1 polypeptide in the presence of said candidate modulator, wherein a difference in the activity of said GlyT1 polypeptide in the presence of said candidate modulator in comparison to the activity in the absence of said candidate modulator indicates that the candidate modulator is a modulator of said GlyT1 polypeptide.

In another aspect, the present invention provides a method of identifying a modulator of a GlyT1 polypeptide, said method comprising: a) contacting any of the herein-described polypeptides with a test compound; and b) detecting the activity of said polypeptide in the presence and absence of said compound; wherein a detection of a difference in said activity in the presence of said compound in comparison to the activity in the absence of said compound indicates that said compound is a modulator of said GlyT1 polypeptide.

In one embodiment of these methods, said polypeptide is present in a cell or cell membrane, and said activity comprises glycine transport activity.

In another aspect, the present invention provides a method for the preparation of a pharmaceutical composition comprising a) identifying a modulator of a GlyT1 polypeptide using any of the herein-described methods; and b) combining said modulator with a physiologically acceptable carrier.

The present invention also relates to diagnostic methods and uses of the present polynucleotides and polypeptides for identifying humans or non-human animals having elevated or reduced levels of expression of any one or combination of the herein-described variants, which individuals are likely to benefit from therapies to suppress or enhance the expression of the variant or variants, respectively, and to methods of identifying individuals or non-human animals at increased risk for developing, or at present having, diseases or disorders associated with expression or biological activity of any one or combination of the herein-described variants.

The present invention also relates to kits, uses and methods for screening compounds for their ability to modulate (e.g. increase or inhibit) the activity or expression of any of the present variants. Uses of such compounds are also within the scope of the present invention.

The present invention also relates to pharmaceutical or physiologically acceptable compositions comprising, as an active agent, the polypeptides, polynucleotides or antibodies of the present invention, as well as, typically, a pharmaceutically acceptable carrier.

The present invention also provides the use of any of the herein-described GlyT1 polynucleotides, polypeptides, antibodies, modulators, or kits, in the diagnosis or treatment of any disorder, preferably a neurological or psychiatric disorder such as schizophrenia, or in the preparation of a medicament for the treatment of any disorder including neurological or psychiatric disorders such as schizophrenia.

In another aspect, the present invention provides a computer readable medium having stored thereon a sequence selected from the group consisting of a nucleic acid code comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of any of the sequences shown as SEQ ID NOs:2-9 or 14-21.

In another aspect, the present invention provides a computer readable medium having stored thereon a sequence consisting of a polypeptide code comprising a contiguous span of at least 6, 8, 10, 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of any of the amino acid sequences shown as SEQ ID NOs:26-33.

In another aspect, the present invention provides a computer system comprising a processor and a data storage device, wherein said data storage device comprises any of the herein-described computer readable media.

In one embodiment, the computer system further comprises a sequence comparer and a data storage device having reference sequences stored thereon. In another embodiment, the computer system further comprises an identifier which identifies features in said sequence.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 shows the 8 novel GlyT1 splice variants of the present invention. The exon structure is shown for each variant in comparison with the structure for the previously described variant of Genbank accession no. S70612. The genomic structure for all of the exons is also indicated within the genomic sequence presented in Genbank accession no. AC005038.

BRIEF DESCRIPTION OF THE SEQUENCES PROVIDED IN THE SEQUENCE LISTING

SEQ ID NO: 1 provides the genomic sequence of the GlyT1 gene, comprising the 5′ regulatory region (upstream untranscribed region), the exons and introns, and the 3′ regulatory region (downstream untranscribed region).

SEQ ID NOs: 2-9 provide novel exons of the GlyT1 gene.

SEQ ID NOs: 10-13 provide DNA sequences encoding previously known GlyT1 variants (GlyT1a, GlyT1b, GlyT1c, GlyT1d).

SEQ ID NOs: 14-21 provide novel DNA sequences encoding novel GlyT1 variants (Genset variants 1-8).

SEQ ID NOs: 22-25 provide protein sequences of previously known GlyT1 variants.

SEQ ID NOs: 26-33 provide protein sequences of the novel Genset GlyT1 variants.

SEQ ID NOs: 34 and 35 provide the sequences of oligonucleotides SLC6A9LF and SLC6A9LR, which were used in the cloning of the presently-provided GlyT1 variants.

SEQ ID NO: 36 provides a primer sequence containing the additional PU 5′ sequence described further in Example 2.

SEQ ID NO: 37 provides a primer sequence containing the additional RP 5′ sequence described further in Example 2.

In accordance with the regulations relating to Sequence Listings, the following codes have been used in the Sequence Listing to indicate the locations of biallelic markers within the sequences and to identify each of the alleles present at the polymorphic base. The code “r” in the sequences indicates that one allele of the polymorphic base is a guanine, while the other allele is an adenine. The code “y” in the sequences indicates that one allele of the polymorphic base is a thymine, while the other allele is a cytosine. The code “m” in the sequences indicates that one allele of the polymorphic base is an adenine, while the other allele is a cytosine. The code “k” in the sequences indicates that one allele of the polymorphic base is a guanine, while the other allele is a thymine. The code “s” in the sequences indicates that one allele of the polymorphic base is a guanine, while the other allele is a cytosine. The code “w” in the sequences indicates that one allele of the polymorphic base is an adenine, while the other allele is a thymine.
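By way of illustration, the six codes described above can be held in a small lookup table and used to locate polymorphic bases in a sequence. The following Python sketch is not part of the Sequence Listing; the function names and the example sequence are hypothetical.

```python
# Two-allele codes exactly as described in the text above.
BIALLELIC_CODES = {
    "r": ("g", "a"),  # guanine or adenine
    "y": ("t", "c"),  # thymine or cytosine
    "m": ("a", "c"),  # adenine or cytosine
    "k": ("g", "t"),  # guanine or thymine
    "s": ("g", "c"),  # guanine or cytosine
    "w": ("a", "t"),  # adenine or thymine
}


def alleles_at(sequence, position):
    """Return the possible bases at a 0-based position of the sequence."""
    base = sequence[position].lower()
    return BIALLELIC_CODES.get(base, (base,))


def find_biallelic_markers(sequence):
    """List (0-based position, alleles) for every polymorphic base code."""
    return [
        (index, BIALLELIC_CODES[base])
        for index, base in enumerate(sequence.lower())
        if base in BIALLELIC_CODES
    ]
```

For a hypothetical sequence such as "acgrtty", the sketch reports one marker at the "r" and one at the "y", each with its two alleles.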

In some instances, the polymorphic bases of biallelic markers alter the identity of one or more amino acids in the encoded polypeptide. This is indicated in the accompanying Sequence Listing by use of the feature VARIANT, placement of an Xaa at the position of the polymorphic amino acid, and definition of Xaa as the two alternative amino acids. For example, if one allele of a biallelic marker is the codon CAC, which encodes histidine, while the other allele of the biallelic marker is CAA, which encodes glutamine, the Sequence Listing for the encoded polypeptide will contain an Xaa at the location of the polymorphic amino acid. In this instance, Xaa would be defined as being histidine or glutamine.
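The histidine/glutamine example above can be sketched as follows. This is a minimal illustration in Python that assumes the standard genetic code and includes only the four codons needed for the example; the function name and the output format are hypothetical and are not the Sequence Listing notation itself.

```python
# Subset of the standard genetic code, sufficient for the example above.
CODON_TABLE_SUBSET = {
    "CAC": "His",
    "CAT": "His",
    "CAA": "Gln",
    "CAG": "Gln",
}


def polymorphic_residue(codon_allele_1, codon_allele_2):
    """Describe the residue encoded at a codon carrying a biallelic marker.

    Returns the amino acid when both alleles encode the same residue,
    otherwise an Xaa entry naming the two alternative amino acids.
    """
    residue_1 = CODON_TABLE_SUBSET[codon_allele_1.upper()]
    residue_2 = CODON_TABLE_SUBSET[codon_allele_2.upper()]
    if residue_1 == residue_2:
        return residue_1
    return f"Xaa ({residue_1} or {residue_2})"
```

With the alleles CAC and CAA the sketch yields an Xaa defined as His or Gln, whereas CAC and CAT both encode histidine, so no Xaa would be needed.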

In other instances, Xaa may indicate an amino acid whose identity is unknown because of nucleotide sequence ambiguity. In this instance, the feature UNSURE is used, together with placement of an Xaa at the position of the unknown amino acid and definition of Xaa as being any of the 20 amino acids or a limited number of amino acids suggested by the genetic code.

DETAILED DESCRIPTION

The present invention concerns novel polynucleotides and polypeptides related to the GlyT1 gene. Oligonucleotide probes and primers hybridizing specifically with these novel polynucleotides are also part of the invention. A further object of the invention consists of recombinant vectors comprising any of the nucleic acid sequences described in the present invention, as well as cell hosts comprising said nucleic acid sequences or recombinant vectors. The invention also encompasses methods of screening for molecules which modulate the activity of the present proteins. The invention also deals with antibodies directed specifically against the present polypeptides, which are useful as diagnostic reagents.

DEFINITIONS

Before describing the invention in greater detail, the following definitions are set forth to illustrate and define the meaning and scope of the terms used to describe the invention herein.

The term “GlyT1 gene”, when used herein, encompasses genomic, mRNA and cDNA sequences encoding the GlyT1 protein, including the untranslated regulatory regions of the genomic DNA, and including any of the herein-described variants.

A GlyT1 “variant” can refer to any GlyT1 polynucleotide or polypeptide, in particular a GlyT1 polypeptide or polynucleotide differing at one or more nucleotides or amino acids from other GlyT1 sequences, especially differing from other GlyT1 sequences as a result of differential mRNA splicing. Most specifically, GlyT1 variants refer to the novel GlyT1 polynucleotides and polypeptides shown here as SEQ ID NOs: 14-21 and 26-33, and to conservatively substituted relatives thereof.

The term “heterologous protein”, when used herein, is intended to designate any protein or polypeptide other than a GlyT1 protein of interest.

A “functional” glycine transporter refers to any polypeptide with one or more detectable activities of glycine transporters such as full-length GlyT1, for example the ability to transport glycine across a membrane in in vitro or in vivo assays, and also including glycine binding, neuronal activation in cells expressing the transporter, interaction with additional ligands, etc. Examples of such assays can be found in the section entitled, “Methods for identifying modulators of GlyT1 activity.”

The term “isolated” requires that the material be removed from its original environment (e.g., the natural environment if it is naturally occurring). For example, a naturally-occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or DNA or polypeptide, separated from some or all of the coexisting materials in the natural system, is isolated. Such a polynucleotide could be part of a vector and/or such polynucleotide or polypeptide could be part of a composition, and still be isolated in that the vector or composition is not part of its natural environment.

For example, a naturally-occurring polynucleotide present in a living animal is not isolated, but the same polynucleotide, separated from some or all of the coexisting materials in the natural system, is isolated. Specifically excluded from the definition of “isolated” are: naturally-occurring chromosomes (such as chromosome spreads), artificial chromosome libraries, genomic libraries, and cDNA libraries that exist either as an in vitro nucleic acid preparation or as a transfected/transformed host cell preparation, wherein the host cells are either an in vitro heterogeneous preparation or plated as a heterogeneous population of single colonies. Also specifically excluded are the above libraries wherein a specified polynucleotide makes up less than 5% of the number of nucleic acid inserts in the vector molecules. Further specifically excluded are whole cell genomic DNA or whole cell RNA preparations (including said whole cell preparations which are mechanically sheared or enzymatically digested). Further specifically excluded are the above whole cell preparations as either an in vitro preparation or as a heterogeneous mixture separated by electrophoresis (including blot transfers of the same) wherein the polynucleotide of the invention has not further been separated from the heterologous polynucleotides in the electrophoresis medium (e.g., further separating by excising a single band from a heterogeneous band population in an agarose gel or nylon blot).

The term “purified” does not require absolute purity; rather, it is intended as a relative definition. Purification of starting material or natural material to at least one order of magnitude, preferably two or three orders, and more preferably four or five orders of magnitude is expressly contemplated. As an example, purification from 0.1% concentration to 10% concentration is two orders of magnitude. To illustrate, individual cDNA clones isolated from a cDNA library have been conventionally purified to electrophoretic homogeneity. The sequences obtained from these clones could not be obtained directly either from the library or from total human DNA. The cDNA clones are not naturally occurring as such, but rather are obtained via manipulation of a partially purified naturally occurring substance (messenger RNA). The conversion of mRNA into a cDNA library involves the creation of a synthetic substance (cDNA) and pure individual cDNA clones can be isolated from the synthetic library by clonal selection. Thus, creating a cDNA library from messenger RNA and subsequently isolating individual clones from that library results in an approximately 10^4 to 10^6 fold purification of the native message.
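The arithmetic behind "orders of magnitude" of purification is simply the base-10 logarithm of the fold enrichment. A minimal Python sketch follows; the function name is hypothetical, and concentrations are assumed to be expressed as fractions of the total sample.

```python
import math


def purification_orders(start_fraction, end_fraction):
    """Orders of magnitude of purification between two concentrations.

    Both arguments are fractions of the total sample (0.001 means 0.1%).
    """
    if start_fraction <= 0 or end_fraction <= 0:
        raise ValueError("concentrations must be positive")
    fold_purification = end_fraction / start_fraction
    return math.log10(fold_purification)


# The example from the text: 0.1% to 10% is a 100-fold enrichment,
# i.e. approximately two orders of magnitude.
example_orders = purification_orders(0.001, 0.10)
```

On the same scale, the 10^4 to 10^6 fold purification mentioned for cDNA clones corresponds to four to six orders of magnitude.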

The term “purified” is further used herein to describe a polypeptide or polynucleotide of the invention which has been separated from other compounds including, but not limited to, polypeptides or polynucleotides, carbohydrates, lipids, etc. The term “purified” may be used to specify the separation of monomeric polypeptides of the invention from oligomeric forms such as homo- or hetero-dimers, trimers, etc. The term “purified” may also be used to specify the separation of covalently closed polynucleotides from linear polynucleotides. A polynucleotide is substantially pure when at least about 50%, preferably 60 to 75% of a sample exhibits a single polynucleotide sequence and conformation (linear versus covalently closed). A substantially pure polypeptide or polynucleotide typically comprises about 50%, preferably 60 to 90% weight/weight of a polypeptide or polynucleotide sample, respectively, more usually about 95%, and preferably is over about 99% pure. Polypeptide and polynucleotide purity, or homogeneity, is indicated by a number of means well known in the art, such as agarose or polyacrylamide gel electrophoresis of a sample, followed by visualizing a single band upon staining the gel. For certain purposes higher resolution can be provided by using HPLC or other means well known in the art. As an alternative embodiment, purification of the polypeptides and polynucleotides of the present invention may be expressed as “at least” a percent purity relative to heterologous polypeptides and polynucleotides (DNA, RNA or both). As a preferred embodiment, the polypeptides and polynucleotides of the present invention are at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% pure relative to heterologous polypeptides and polynucleotides, respectively. As a further preferred embodiment the polypeptides and polynucleotides have a purity ranging from any number, to the thousandth position, between 90% and 100% (e.g., a polypeptide or polynucleotide at least 99.995% pure) relative to either heterologous polypeptides or polynucleotides, respectively, or as a weight/weight ratio relative to all compounds and molecules other than those existing in the carrier. Each number representing a percent purity, to the thousandth position, may be claimed as individual species of purity.

The term “polypeptide” refers to a polymer of amino acids without regard to the length of the polymer; thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide. This term also does not specify or exclude post-expression modifications of polypeptides, for example, polypeptides which include the covalent attachment of glycosyl groups, acetyl groups, phosphate groups, lipid groups and the like are expressly encompassed by the term polypeptide. Also included within the definition are polypeptides which contain one or more analogs of an amino acid (including, for example, non-naturally occurring amino acids, amino acids which only occur naturally in an unrelated biological system, modified amino acids from mammalian systems etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring.

The term “recombinant polypeptide” is used herein to refer to polypeptides that have been artificially designed and which comprise at least two polypeptide sequences that are not found as contiguous polypeptide sequences in their initial natural environment, or to refer to polypeptides which have been expressed from a recombinant polynucleotide, i.e. using recombinant DNA methods.

As used herein, the term “non-human animal” refers to any non-human vertebrate, birds and more usually mammals, preferably primates, farm animals such as swine, goats, sheep, donkeys, and horses, rabbits or rodents, more preferably rats or mice. As used herein, the term “animal” is used to refer to any vertebrate, preferably a mammal. Both the terms “animal” and “mammal” expressly embrace human subjects unless preceded with the term “non-human”.

As used herein, the term “antibody” refers to a polypeptide or group of polypeptides which are comprised of at least one binding domain, where an antibody binding domain is formed from the folding of variable domains of an antibody molecule to form three-dimensional binding spaces with an internal surface shape and charge distribution complementary to the features of an antigenic determinant of an antigen, which allows an immunological reaction with the antigen. Antibodies include recombinant proteins comprising the binding domains, as well as fragments, including Fab, Fab′, F(ab)2, and F(ab′)2 fragments.

As used herein, an “antigenic determinant” is the portion of an antigen molecule, in this case a GlyT1 polypeptide, that determines the specificity of the antigen-antibody reaction. An “epitope” refers to an antigenic determinant of a polypeptide. An epitope can comprise as few as 3 amino acids in a spatial conformation which is unique to the epitope. Generally an epitope comprises at least 6 such amino acids, and more usually at least 8-10 such amino acids. Methods for determining the amino acids which make up an epitope include x-ray crystallography, 2-dimensional nuclear magnetic resonance, and epitope mapping, e.g. the Pepscan method described by Geysen et al. 1984; PCT Publication No. WO 84/03564; and PCT Publication No. WO 84/03506.

Throughout the present specification, the expression “nucleotide sequence” may be employed to designate indifferently a polynucleotide or a nucleic acid. More precisely, the expression “nucleotide sequence” encompasses the nucleic material itself and is thus not restricted to the sequence information (i.e. the succession of letters chosen among the four base letters) that biochemically characterizes a specific DNA or RNA molecule.

As used interchangeably herein, the terms “nucleic acids”, “oligonucleotides”, and “polynucleotides” include RNA, DNA, or RNA/DNA hybrid sequences of more than one nucleotide in either single chain or duplex form. The term “nucleotide” is used herein as an adjective to describe molecules comprising RNA, DNA, or RNA/DNA hybrid sequences of any length in single-stranded or duplex form. The term “nucleotide” is also used herein as a noun to refer to individual nucleotides or varieties of nucleotides, meaning a molecule, or individual unit in a larger nucleic acid molecule, comprising a purine or pyrimidine, a ribose or deoxyribose sugar moiety, and a phosphate group, or phosphodiester linkage in the case of nucleotides within an oligonucleotide or polynucleotide. The term “nucleotide” is also used herein to encompass “modified nucleotides” which comprise at least one of the following modifications: (a) an alternative linking group, (b) an analogous form of purine, (c) an analogous form of pyrimidine, or (d) an analogous sugar. For examples of analogous linking groups, purines, pyrimidines, and sugars, see for example PCT publication No. WO 95/04064. The polynucleotide sequences of the invention may be prepared by any known method, including synthetic, recombinant, ex vivo generation, or a combination thereof, as well as utilizing any purification methods known in the art.

As used herein, the term “operably linked” refers to a linkage of polynucleotide elements in a functional relationship. For instance, a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the coding sequence.

The terms “trait” and “phenotype” are used interchangeably herein and refer to any visible, detectable or otherwise measurable property of an organism such as symptoms of, or susceptibility to, a disease for example. Typically the terms “trait” or “phenotype” are used herein to refer to symptoms of, or susceptibility to, a disease, a beneficial response to or side effects related to a treatment. Preferably, said trait can be, without being limited to, neurological and psychiatric conditions such as schizophrenia.

The term “allele” is used herein to refer to variants of a nucleotide sequence. A biallelic polymorphism has two forms. Diploid organisms may be homozygous or heterozygous for an allelic form.

The term “heterozygosity rate” is used herein to refer to the incidence of individuals in a population which are heterozygous at a particular allele. In a biallelic system, the heterozygosity rate is on average equal to 2Pa(1-Pa), where Pa is the frequency of the least common allele. In order to be useful in genetic studies, a genetic marker should have an adequate level of heterozygosity to allow a reasonable probability that a randomly selected person will be heterozygous.

The term “genotype” as used herein refers to the identity of the alleles present in an individual or a sample. In the context of the present invention, a genotype preferably refers to the description of the biallelic marker alleles present in an individual or a sample. The term “genotyping” a sample or an individual for a biallelic marker involves determining the specific allele or the specific nucleotide carried by an individual at a biallelic marker.

The term “mutation” as used herein refers to a difference in DNA sequence between or among different genomes or individuals which has a frequency below 1%.

The term “haplotype” refers to a combination of alleles present in an individual or a sample. In the context of the present invention, a haplotype preferably refers to a combination of biallelic marker alleles found in a given individual and which may be associated with a phenotype.

The term “polymorphism” as used herein refers to the occurrence of two or more alternative genomic sequences or alleles between or among different genomes or individuals. “Polymorphic” refers to the condition in which two or more variants of a specific genomic sequence can be found in a population. A “polymorphic site” is the locus at which the variation occurs. A single nucleotide polymorphism is the replacement of one nucleotide by another nucleotide at the polymorphic site. Deletion of a single nucleotide or insertion of a single nucleotide also gives rise to single nucleotide polymorphisms. In the context of the present invention, “single nucleotide polymorphism” preferably refers to a single nucleotide substitution. Typically, between different individuals, the polymorphic site may be occupied by two different nucleotides.

The terms “biallelic polymorphism” and “biallelic marker” are used interchangeably herein to refer to a single nucleotide polymorphism having two alleles at a fairly high frequency in the population. A “biallelic marker allele” refers to the nucleotide variants present at a biallelic marker site. Typically, the frequency of the less common allele of the biallelic markers of the present invention has been validated to be greater than 1%, preferably the frequency is greater than 10%, more preferably the frequency is at least 20% (i.e. heterozygosity rate of at least 0.32), even more preferably the frequency is at least 30% (i.e. heterozygosity rate of at least 0.42). A biallelic marker wherein the frequency of the less common allele is 30% or more is termed a “high quality biallelic marker”.
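By way of illustration only, the relationship between the frequency of the less common allele and the heterozygosity rate, 2Pa(1-Pa), may be sketched as follows. The function names are hypothetical and are not part of the invention; the 30% threshold is the one given above for a “high quality biallelic marker”.

```python
def heterozygosity_rate(minor_allele_freq: float) -> float:
    """Average heterozygosity rate 2*Pa*(1-Pa) of a biallelic system.

    Pa is the frequency of the less common allele, so it lies in [0, 0.5].
    """
    if not 0.0 <= minor_allele_freq <= 0.5:
        raise ValueError("frequency of the less common allele must be in [0, 0.5]")
    return 2.0 * minor_allele_freq * (1.0 - minor_allele_freq)


def is_high_quality_marker(minor_allele_freq: float) -> bool:
    """True when the less common allele has a frequency of 30% or more."""
    return minor_allele_freq >= 0.30


if __name__ == "__main__":
    # Frequencies named in the definition of a biallelic marker.
    for pa in (0.01, 0.10, 0.20, 0.30):
        rate = heterozygosity_rate(pa)
        print(f"Pa = {pa:.2f}  heterozygosity rate = {rate:.2f}  "
              f"high quality = {is_high_quality_marker(pa)}")
```

For allele frequencies of 20% and 30% the sketch reproduces the heterozygosity rates of 0.32 and 0.42 stated above.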

The location of nucleotides in a polynucleotide with respect to the center of the polynucleotide is described herein in the following manner. When a polynucleotide has an odd number of nucleotides, the nucleotide at an equal distance from the 3′ and 5′ ends of the polynucleotide is considered to be “at the center” of the polynucleotide, and any nucleotide immediately adjacent to the nucleotide at the center, or the nucleotide at the center itself, is considered to be “within 1 nucleotide of the center.” With an odd number of nucleotides in a polynucleotide, any of the five nucleotide positions in the middle of the polynucleotide would be considered to be within 2 nucleotides of the center, and so on. When a polynucleotide has an even number of nucleotides, there would be a bond and not a nucleotide at the center of the polynucleotide. Thus, either of the two central nucleotides would be considered to be “within 1 nucleotide of the center” and any of the four nucleotides in the middle of the polynucleotide would be considered to be “within 2 nucleotides of the center”, and so on. For polymorphisms which involve the substitution, insertion or deletion of 1 or more nucleotides, the polymorphism, allele or biallelic marker is “at the center” of a polynucleotide if the difference between the distance from the substituted, inserted, or deleted polynucleotides of the polymorphism and the 3′ end of the polynucleotide, and the distance from the substituted, inserted, or deleted polynucleotides of the polymorphism and the 5′ end of the polynucleotide is zero or one nucleotide. If this difference is 0 to 3, then the polymorphism is considered to be “within 1 nucleotide of the center.” If the difference is 0 to 5, the polymorphism is considered to be “within 2 nucleotides of the center.” If the difference is 0 to 7, the polymorphism is considered to be “within 3 nucleotides of the center,” and so on.
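As a non-limiting illustration, the difference-based rule stated above for polymorphisms may be sketched as follows. The sketch assumes 1-based positions counted from the 5′ end and generalizes the stated thresholds (0 to 3, 0 to 5, 0 to 7) to a difference of at most 2k+1 for “within k nucleotides of the center”; the function names are hypothetical. For polynucleotides with an even number of nucleotides, this difference-based rule can be more permissive than the per-nucleotide wording given for single positions.

```python
from typing import Optional


def flank_difference(length: int, start: int, end: Optional[int] = None) -> int:
    """Absolute difference between the 5' and 3' flanks of a polymorphism.

    `start` and `end` are 1-based, inclusive positions of the substituted,
    inserted, or deleted nucleotides; `end` defaults to `start`.
    """
    if end is None:
        end = start
    if not 1 <= start <= end <= length:
        raise ValueError("positions must satisfy 1 <= start <= end <= length")
    upstream = start - 1        # nucleotides between the 5' end and the polymorphism
    downstream = length - end   # nucleotides between the polymorphism and the 3' end
    return abs(upstream - downstream)


def is_at_center(length: int, start: int, end: Optional[int] = None) -> bool:
    """'At the center' when the flank difference is zero or one nucleotide."""
    return flank_difference(length, start, end) <= 1


def is_within_center(length: int, k: int, start: int, end: Optional[int] = None) -> bool:
    """'Within k nucleotides of the center' when the difference is 0 to 2k+1."""
    return flank_difference(length, start, end) <= 2 * k + 1


if __name__ == "__main__":
    # Hypothetical 47-mer with a single-nucleotide polymorphism at position 24.
    print(is_at_center(47, 24))          # 23 nucleotides on each side
    print(is_within_center(47, 1, 23))   # difference of 2
    print(is_within_center(47, 1, 22))   # difference of 4
```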

The term “upstream” is used herein to refer to a location which is toward the 5′ end of the polynucleotide from a specific reference point, and “downstream” refers to locations in the 3′ direction.

The terms “base paired” and “Watson & Crick base paired” are used interchangeably herein to refer to nucleotides which can be hydrogen bonded to one another by virtue of their sequence identities in a manner like that found in double-helical DNA with thymine or uracil residues linked to adenine residues by two hydrogen bonds and cytosine and guanine residues linked by three hydrogen bonds (See Stryer, L., Biochemistry, 4th edition, 1995).

The terms “complementary” or “complement thereof” are used herein to refer to the sequences of polynucleotides which are capable of forming Watson & Crick base pairing with another specified polynucleotide throughout the entirety of the complementary region. For the purpose of the present invention, a first polynucleotide is deemed to be complementary to a second polynucleotide when each base in the first polynucleotide is paired with its complementary base. Complementary bases are, generally, A and T (or A and U), or C and G. “Complement” is used herein as a synonym for “complementary polynucleotide”, “complementary nucleic acid” and “complementary nucleotide sequence”. These terms are applied to pairs of polynucleotides based solely upon their sequences and not any particular set of conditions under which the two polynucleotides would actually bind.
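By way of example only, a purely sequence-based test of complementarity may be sketched as follows. The sketch assumes that both sequences are written 5′ to 3′ and pair in antiparallel orientation, as in double-helical DNA, and that complementarity is required over the full length of both sequences; the function name is hypothetical.

```python
# Watson & Crick pairs named in the definition: A with T (or U), and C with G.
ALLOWED_PAIRS = {frozenset("AT"), frozenset("AU"), frozenset("CG")}


def is_complementary(first: str, second: str) -> bool:
    """Return True when every base of `first` pairs with `second`.

    Both sequences are read 5'->3'; `second` is reversed so that the two
    strands are compared in antiparallel orientation.
    """
    first, second = first.upper(), second.upper()
    if len(first) != len(second):
        return False
    return all(
        frozenset((a, b)) in ALLOWED_PAIRS
        for a, b in zip(first, reversed(second))
    )


if __name__ == "__main__":
    print(is_complementary("ATGC", "GCAT"))  # DNA/DNA pair
    print(is_complementary("AUGC", "GCAT"))  # RNA/DNA pair
    print(is_complementary("ATGC", "ATGC"))  # not complementary
```

Consistent with the definition above, the test depends only on the two sequences and not on any hybridization conditions.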

Variants and Fragments

1—Polynucleotides

The invention also relates to variants and fragments of the polynucleotides described herein.

Variants of polynucleotides, as the term is used herein, are polynucleotides that differ from a reference polynucleotide. A variant of a polynucleotide may be a naturally occurring variant such as a naturally occurring allelic variant, or it may be a variant that is not known to occur naturally. Such non-naturally occurring variants of the polynucleotide may be made by mutagenesis techniques, including those applied to polynucleotides, cells or organisms. Generally, differences are limited so that the nucleotide sequences of the reference and the variant are closely similar overall and, in many regions, identical.

Variants of polynucleotides according to the invention include, without being limited to, nucleotide sequences which are at least 95% identical to a polynucleotide selected from the group consisting of the nucleotide sequences of SEQ ID Nos: 2-9 or 14-21, or to any polynucleotide fragment of at least 12 consecutive nucleotides of a polynucleotide selected from the group consisting of the nucleotide sequences of SEQ ID Nos: 2-9 or 14-21, and preferably at least 99% identical, more particularly at least 99.5% identical, and most preferably at least 99.8% identical to a polynucleotide selected from the group consisting of the nucleotide sequences of SEQ ID Nos: 2-9 or 14-21, or to any polynucleotide fragment of at least 12 consecutive nucleotides of a polynucleotide selected from the group consisting of the nucleotide sequences of SEQ ID Nos: 2-9 or 14-21.

In particular, the present invention comprises polynucleotide and polypeptide sequences spanning regions comprising biallelic markers within the GlyT1 gene. Methods of identifying such markers, and of using them for diagnosis, gene mapping, association studies, and other applications are well known to those of skill in the art.

Nucleotide changes present in a variant polynucleotide may be silent, which means that they do not alter the amino acids encoded by the polynucleotide. However, nucleotide changes may also result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference sequence. The substitutions, deletions or additions may involve one or more nucleotides. The variants may be altered in coding or non-coding regions or both. Alterations in the coding regions may produce conservative or non-conservative amino acid substitutions, deletions or additions.

In the context of the present invention, particularly preferred embodiments are those in which the polynucleotides encode polypeptides which retain substantially the same biological function or activity as the mature GlyT1 protein, or those in which the polynucleotides encode polypeptides which maintain or increase a particular biological activity, while reducing a second biological activity.

A polynucleotide fragment is a polynucleotide having a sequence that is entirely the same as part but not all of a given nucleotide sequence, preferably the nucleotide sequence of a GlyT1 gene, and variants thereof. The fragment can be a portion of an intron or an exon of a GlyT1 gene. It can also be a portion of the regulatory regions of GlyT1.

Such fragments may be “free-standing”, i.e. not part of or fused to other polynucleotides, or they may be comprised within a single larger polynucleotide of which they form a part or region. Indeed, several of these fragments may be present within a single larger polynucleotide.

Optionally, such fragments may consist of, or consist essentially of, a contiguous span of at least 8, 10, 12, 15, 18, 20, 25, 35, 40, 50, 70, 80, 100, 250, 500 or 1000 nucleotides in length.

2—Polypeptides

The invention also relates to variants, fragments, analogs and derivatives of the polypeptides described herein, including mutated GlyT1 proteins.

The variant may be 1) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue and such substituted amino acid residue may or may not be one encoded by the genetic code, or 2) one in which one or more of the amino acid residues includes a substituent group, or 3) one in which the mutated GlyT1 is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or 4) one in which the additional amino acids are fused to the mutated GlyT1, such as a leader or secretory sequence or a sequence which is employed for purification of the mutated GlyT1 or a preprotein sequence. Such variants are deemed to be within the scope of those skilled in the art.

A polypeptide fragment is a polypeptide having a sequence that is entirely the same as part but not all of a given polypeptide sequence, preferably a polypeptide encoded by a GlyT1 gene and variants thereof.

In the case of an amino acid substitution in the amino acid sequence of a polypeptide according to the invention, one or several amino acids can be replaced by “equivalent” amino acids. The expression “equivalent” amino acid is used herein to designate any amino acid that may be substituted for one of the amino acids having similar properties, such that one skilled in the art of peptide chemistry would expect the secondary structure and hydropathic nature of the polypeptide to be substantially unchanged. Generally, the following groups of amino acids represent equivalent changes: (1) Ala, Pro, Gly, Glu, Asp, Gln, Asn, Ser, Thr; (2) Cys, Ser, Tyr, Thr; (3) Val, Ile, Leu, Met, Ala, Phe; (4) Lys, Arg, His; (5) Phe, Tyr, Trp, His.
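For illustration only, the five groups of equivalent amino acids listed above may be encoded as sets, so that a substitution is treated as “equivalent” when both residues share at least one group. The fourth group is read here as Lys, Arg, His. The names used in this sketch are hypothetical.

```python
# Groups (1) to (5) of equivalent amino acids, in three-letter code.
EQUIVALENCE_GROUPS = (
    {"Ala", "Pro", "Gly", "Glu", "Asp", "Gln", "Asn", "Ser", "Thr"},
    {"Cys", "Ser", "Tyr", "Thr"},
    {"Val", "Ile", "Leu", "Met", "Ala", "Phe"},
    {"Lys", "Arg", "His"},
    {"Phe", "Tyr", "Trp", "His"},
)


def is_equivalent_substitution(original: str, replacement: str) -> bool:
    """True when both residues belong to at least one common group."""
    return any(
        original in group and replacement in group
        for group in EQUIVALENCE_GROUPS
    )


if __name__ == "__main__":
    print(is_equivalent_substitution("Lys", "Arg"))  # both in group (4)
    print(is_equivalent_substitution("Ala", "Val"))  # both in group (3)
    print(is_equivalent_substitution("Lys", "Asp"))  # no common group
```

Because several residues (for example Ala, Ser, Thr, Tyr, Phe and His) appear in more than one group, equivalence under this reading is not transitive.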

A specific embodiment of a modified GlyT1 peptide molecule of interest according to the present invention includes, but is not limited to, a peptide molecule which is resistant to proteolysis, namely a peptide in which the —CONH— peptide bond is modified and replaced by a (CH2NH) reduced bond, a (NHCO) retro inverso bond, a (CH2-O) methylene-oxy bond, a (CH2-S) thiomethylene bond, a (CH2CH2) carba bond, a (CO—CH2) ketomethylene bond, a (CHOH—CH2) hydroxyethylene bond, a (N—N) bond, an E-alkene bond or also a —CH═CH— bond. The invention also encompasses a human GlyT1 polypeptide or a fragment or a variant thereof in which at least one peptide bond has been modified as described above.

Such fragments may be “free-standing”, i.e. not part of or fused to other polypeptides, or they may be comprised within a single larger polypeptide of which they form a part or region.

However, several fragments may be comprised within a single larger polypeptide. As representative examples of polypeptide fragments of the invention, there may be mentioned those which are from about 5, 6, 7, 8, 9 or 10 to 15, 10 to 20, 15 to 40, or 30 to 55 amino acids long. In one embodiment, the fragments contain at least one amino acid mutation in the GlyT1 protein.

Identity Between Nucleic Acids or Polypeptides

The terms “percentage of sequence identity” and “percentage homology” are used interchangeably herein to refer to comparisons among polynucleotides and polypeptides, and are determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Homology is evaluated using any of the variety of sequence comparison algorithms and programs known in the art. Such algorithms and programs include, but are by no means limited to, TBLASTN, BLASTP, FASTA, TFASTA, and CLUSTALW (Pearson and Lipman, 1988; Altschul et al., 1990; Thompson et al., 1994; Higgins et al., 1996; Altschul et al., 1993). In a particularly preferred embodiment, protein and nucleic acid sequence homologies are evaluated using the Basic Local Alignment Search Tool (“BLAST”) which is well known in the art (see, e.g., Karlin and Altschul, 1990; Altschul et al., 1990, 1993, 1997). In particular, five specific BLAST programs are used to perform the following tasks:

(1) BLASTP and BLAST3 compare an amino acid query sequence against a protein sequence database;

(2) BLASTN compares a nucleotide query sequence against a nucleotide sequence database;

(3) BLASTX compares the six-frame conceptual translation products of a query nucleotide sequence (both strands) against a protein sequence database;

(4) TBLASTN compares a query protein sequence against a nucleotide sequence database translated in all six reading frames (both strands); and

(5) TBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
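The “six-frame conceptual translation” referred to in items (3) to (5) may be illustrated by the following sketch, which translates a nucleotide sequence in the three reading frames of each strand using the standard genetic code. It is not the BLAST implementation; the function names and frame labels are hypothetical, and codons containing characters other than A, C, G and T are rendered as “X”.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; trailing bases are ignored."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Map frame labels ('+1'..'+3', '-1'..'-3') to conceptual translations."""
    seq = seq.upper().replace("U", "T")
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```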

The BLAST programs identify homologous sequences by identifying similar segments, which are referred to herein as “high-scoring segment pairs,” between a query amino or nucleic acid sequence and a test sequence which is preferably obtained from a protein or nucleic acid sequence database. High-scoring segment pairs are preferably identified (i.e., aligned) by means of a scoring matrix, many of which are known in the art. Preferably, the scoring matrix used is the BLOSUM62 matrix (Gonnet et al., 1992; Henikoff and Henikoff, 1993). Less preferably, the PAM or PAM250 matrices may also be used (see, e.g., Schwartz and Dayhoff, eds., 1978). The BLAST programs evaluate the statistical significance of all high-scoring segment pairs identified, and preferably select those segments which satisfy a user-specified threshold of significance, such as a user-specified percent homology. Preferably, the statistical significance of a high-scoring segment pair is evaluated using the statistical significance formula of Karlin (see, e.g., Karlin and Altschul, 1990).

The BLAST programs may be used with the default parameters or with modified parameters provided by the user.
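By way of illustration, the calculation of the percentage of sequence identity defined in this section (matched positions divided by the total number of positions in the comparison window, multiplied by 100) may be sketched as follows. The sketch assumes that the two sequences have already been optimally aligned by a separate program and that gaps are written as “-”; the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over an aligned comparison window.

    Both arguments are rows of the same alignment, so they must have the
    same length. A position counts as matched only when both rows carry
    the same base or residue; a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical 9-position window: one gap and one mismatch, 7 matches.
    identity = percent_identity("ACGT-ACGT", "ACGTTACGA")
    print(f"{identity:.1f}% identical")
    print("meets a 95% threshold:", identity >= 95.0)
```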

Stringent Hybridization Conditions

For the purpose of defining such a hybridizing nucleic acid according to the invention, the stringent hybridization conditions are the following:

The hybridization step is carried out at 65° C. in the presence of 6×SSC buffer, 5×Denhardt's solution, 0.5% SDS and 100 μg/ml of salmon sperm DNA.

The hybridization step is followed by four washing steps:

- two washings during 5 min, preferably at 65° C. in a 2×SSC and 0.1% SDS buffer;
- one washing during 30 min, preferably at 65° C. in a 2×SSC and 0.1% SDS buffer; and
- one washing during 10 min, preferably at 65° C. in a 0.1×SSC and 0.1% SDS buffer,

these hybridization conditions being suitable for a nucleic acid molecule of about 20 nucleotides in length. Needless to say, the hybridization conditions described above are to be adapted according to the length of the desired nucleic acid, following techniques well known to the one skilled in the art. Suitable hybridization conditions may for example be adapted according to the teachings disclosed in the book of Hames and Higgins (1985).

GlyT1 cDNA Sequences

The expression of the GlyT1 gene has been shown to lead to the production of a number of distinct mRNA species, the novel nucleic acid sequences of eight of which are set forth herein as SEQ ID Nos: 14-21.

Another object of the invention is a purified, isolated, or recombinant nucleic acid comprising the nucleotide sequence of SEQ ID Nos: 14-21, complementary sequences thereto, as well as allelic variants, and fragments thereof. Moreover, preferred polynucleotides of the invention include purified, isolated, or recombinant GlyT1 cDNAs consisting of, consisting essentially of, or comprising the sequence of SEQ ID Nos: 2-9. Particularly preferred nucleic acids of the invention include isolated, purified, or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID Nos: 2-9 or 14-21, or the complements thereof.

The invention also pertains to a purified or isolated nucleic acid comprising a polynucleotide having at least 95% nucleotide identity with a polynucleotide of SEQ ID Nos: 2-9 or 14-21, advantageously 99% nucleotide identity, preferably 99.5% nucleotide identity and most preferably 99.8% nucleotide identity with a polynucleotide of SEQ ID Nos: 2-9 or 14-21, or a sequence complementary thereto or a biologically active fragment thereof.

Another object of the invention relates to purified, isolated or recombinant nucleic acids comprising a polynucleotide that hybridizes, under the stringent hybridization conditions defined herein, with a polynucleotide comprising a sequence of SEQ ID Nos: 2-9 or 14-21, or a sequence complementary thereto, or a variant thereof or a biologically active fragment thereof.

The novel cDNAs of the present invention comprise novel combinations of previously identified exons, as well as novel exons. For example, Table I provides a list of the exons present in previously-identified variants GlyT1a-GlyT1d, as well as the presently provided novel variants. SEQ ID NO: 1 provides the genomic DNA sequence of the GlyT1 gene, and notes the positions of each of the herein-referenced exons. The exon structure of the novel variants is also presented in FIG. 1.

TABLE I
Variants of the GlyT1 gene

Variant             Exon configuration            Size of encoded protein (aa)
1a                  1, 2, 5-16                    633
1b                  3, 5-16                       638
1c                  3-16                          692
1d                  1a, 2, 4-16                   687
Genset Variant 1    3, 4d, 5-16                   184
Genset Variant 2    3, 5a, 6-16                   125
Genset Variant 3    3, 6-16                       64
Genset Variant 4    3, 5-7, 7bis, 8-16            229
Genset Variant 5    3, 4ter, 5-12, 13a, 14-16     94
Genset Variant 6    3, 5-12, 13a, 14-16           456
Genset Variant 7    3, 5-14, 15a, 16              550
Genset Variant 8    3, 4bis, 5-16                 188

The cDNA of SEQ ID No: 14 (Genset variant 1) includes a 5′-UTR region starting from the nucleotide at position 1 and ending at the nucleotide in position 234, an open reading frame spanning the nucleotide positions 235-789, and a 3′-UTR region starting from the nucleotide at position 790 and ending at the nucleotide at position 2265. The protein encoded by this cDNA comprises 184 amino acids and is shown as SEQ ID NO: 26.

The cDNA of SEQ ID No: 15 (Genset variant 2) includes a 5′-UTR region starting from the nucleotide at position 1 and ending at the nucleotide in position 234, an open reading frame spanning the nucleotide positions 235-612, and a 3′-UTR region starting from the nucleotide at position 613 and ending at the nucleotide at position 2088. The protein encoded by this cDNA comprises 125 amino acids and is shown as SEQ ID NO: 27.

The cDNA of SEQ ID No: 16 (Genset variant 3) includes a 5′-UTR region starting from the nucleotide at position 1 and ending at the nucleotide in position 234, an open reading frame spanning the nucleotide positions 235-429, and a 3′-UTR region starting from the nucleotide at position 430 and ending at the nucleotide at position 2014. The protein encoded by this cDNA comprises 64 amino acids and is shown as SEQ ID NO: 28.

The cDNA of SEQ ID No: 17 (Genset variant 4) includes a 5′-UTR region starting from the nucleotide at position 1 and ending at the nucleotide in position 234, an open reading frame spanning the nucleotide positions 235-924, and a 3′-UTR region starting from the nucleotide at position 925 and ending at the nucleotide at position 2242. The protein encoded by this cDNA comprises 229 amino acids and is shown as SEQ ID NO: 29.

The cDNA of SEQ ID No: 18 (Genset variant 5) includes a 5′-UTR region starting from the nucleotide at position 1 and ending at the nucleotide in position 234, an open reading frame spanning the nucleotide positions 235-519, and a 3′-UTR region starting from the nucleotide at position 520 and ending at the nucleotide at position 2322. The protein encoded by this cDNA comprises 94 amino acids and is shown as SEQ ID NO: 30.

The cDNA of SEQ ID No: 19 (Genset variant 6) includes a 5′-UTR region starting from the nucleotide at position 1 and ending at the nucleotide in position 234, an open reading frame spanning the nucleotide positions 235-1605, and a 3′-UTR region starting from the nucleotide at position 1606 and ending at the nucleotide at position 2167. The protein encoded by this cDNA comprises 456 amino acids and is shown as SEQ ID NO: 31.

The cDNA of SEQ ID No: 20 (Genset variant 7) includes a 5′-UTR region starting from the nucleotide at position 1 and ending at the nucleotide in position 234, an open reading frame spanning the nucleotide positions 235-1887, and a 3′-UTR region starting from the nucleotide at position 1888 and ending at the nucleotide at position 2371. The protein encoded by this cDNA comprises 550 amino acids and is shown as SEQ ID NO: 32.

The cDNA of SEQ ID No: 21 (Genset variant 8) includes a 5′-UTR region starting from the nucleotide at position 1 and ending at the nucleotide in position 234, an open reading frame spanning the nucleotide positions 235-801, and a 3′-UTR region starting from the nucleotide at position 802 and ending at the nucleotide at position 2277. The protein encoded by this cDNA comprises 188 amino acids and is shown as SEQ ID NO: 33.
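As an illustrative cross-check only, the coordinates given above for SEQ ID Nos: 14-21 can be tested for internal consistency. The sketch assumes that each stated open reading frame span includes the termination codon, so that the number of codons equals the number of encoded amino acids plus one; this assumption, the record layout and the function name are not taken from the sequence listing.

```python
# (SEQ ID No, 5'-UTR end, ORF start, ORF end, 3'-UTR start, cDNA end, protein aa)
GLYT1_CDNAS = [
    (14, 234, 235, 789, 790, 2265, 184),
    (15, 234, 235, 612, 613, 2088, 125),
    (16, 234, 235, 429, 430, 2014, 64),
    (17, 234, 235, 924, 925, 2242, 229),
    (18, 234, 235, 519, 520, 2322, 94),
    (19, 234, 235, 1605, 1606, 2167, 456),
    (20, 234, 235, 1887, 1888, 2371, 550),
    (21, 234, 235, 801, 802, 2277, 188),
]


def find_inconsistencies(records):
    """Return a list of human-readable problems; empty when all is consistent."""
    problems = []
    for seq_id, utr5_end, orf_start, orf_end, utr3_start, cdna_end, n_aa in records:
        orf_length = orf_end - orf_start + 1
        if orf_start != utr5_end + 1:
            problems.append(f"SEQ ID No {seq_id}: ORF does not follow the 5'-UTR")
        if utr3_start != orf_end + 1 or utr3_start > cdna_end:
            problems.append(f"SEQ ID No {seq_id}: 3'-UTR does not follow the ORF")
        if orf_length % 3 != 0:
            problems.append(f"SEQ ID No {seq_id}: ORF length is not a multiple of 3")
        # Assumption: the last codon of the stated ORF span is a stop codon.
        elif orf_length // 3 - 1 != n_aa:
            problems.append(f"SEQ ID No {seq_id}: ORF length disagrees with protein size")
    return problems


if __name__ == "__main__":
    issues = find_inconsistencies(GLYT1_CDNAS)
    print("consistent" if not issues else "\n".join(issues))
```

Under the stated assumption, each open reading frame span is a multiple of three nucleotides and agrees with the protein size given for the corresponding variant.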

Consequently, the invention concerns a purified, isolated, and/or recombinant nucleic acid comprising a nucleotide sequence of the 5′UTR of any of the herein-provided GlyT1 cDNAs, a sequence complementary thereto, or an allelic variant thereof. The invention also concerns a purified, isolated, and/or recombinant nucleic acid comprising a nucleotide sequence of the 3′UTR of any of the herein-provided GlyT1 cDNAs, a sequence complementary thereto, or an allelic variant thereof.

While this section is entitled “GlyT1 cDNA Sequences,” it should be noted that nucleic acid fragments of any size and sequence may also be comprised by the polynucleotides described in this section, flanking the genomic sequences of GLYT1 on either side or between two or more such genomic sequences.

Coding Regions

The open reading frames of the novel GlyT1 cDNAs provided herein are contained in the corresponding mRNAs of SEQ ID Nos: 14-21, as outlined in the previous section. The present invention also embodies isolated, purified, and/or recombinant polynucleotides which encode a polypeptide comprising a contiguous span of at least 6 amino acids, preferably at least 8 or 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, 100, 200 or more amino acids of any of SEQ ID Nos: 26-33.

Certain of the present novel GlyT1 cDNAs comprise novel exons, which are shown as SEQ ID Nos: 2-9. Thus, the present invention also provides purified, isolated, or recombinant polynucleotides that comprise a nucleotide sequence of SEQ ID Nos: 2-9, complementary sequences thereto, as well as allelic variants and fragments thereof. Particularly preferred nucleic acids of the invention include isolated, purified, or recombinant polynucleotides comprising a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300, or more nucleotides of SEQ ID Nos: 2-9, the complements thereof, or which comprise a nucleotide sequence that is at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more identical to any of the sequences shown as SEQ ID NOs: 2-9. In a preferred embodiment, the present invention provides a nucleic acid sequence that is at least about 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more identical to any of the sequences shown as SEQ ID NOs: 14-21, or which hybridizes under stringent or moderate conditions to any of the sequences shown as SEQ ID NOs: 14-21, wherein the nucleic acid sequence comprises any of the sequences shown as SEQ ID NOs: 2-9.

Any of the above-disclosed polynucleotides containing a coding sequence of the GlyT1 gene may be expressed in a desired host cell or a desired host organism, when the polynucleotide is placed under the control of suitable expression signals. The expression signals may be either the expression signals contained in the regulatory regions in the GlyT1 gene of the invention, or, in contrast, the signals may be exogenous regulatory nucleic sequences. Such a polynucleotide, when placed under the suitable expression signals, may also be inserted in a vector for its expression and/or amplification.

Regulatory Sequences of GlyT1

As mentioned, the genomic sequence of the GlyT1 gene contains regulatory sequences both in the non-coding 5′-flanking region and in the non-coding 3′-flanking region that border the coding regions containing the exons of the various cDNAs. The positions of these regulatory sequences in the novel GlyT1 cDNAs are described in the section entitled “GlyT1 cDNA Sequences,” supra.

Biologically active polynucleotide fragments or variants of any of the herein described novel cDNAs (e.g. the 5′UTRs or 3′UTRs) can be detected, e.g., by inserting a candidate sequence into a recombinant vector carrying a detectable marker gene (e.g. beta galactosidase, chloramphenicol acetyl transferase, etc.) (see, e.g., Sambrook et al. (1989)).

Polynucleotides derived from any of these 5′ and 3′ regulatory regions are useful, inter alia, in the detection of at least a copy of any of the nucleotide sequences of SEQ ID No: 1 or 14-21, or a fragment thereof, in a test sample. Polynucleotides carrying the regulatory elements located at the 5′ end and at the 3′ end of the GLYT1 coding region may also be used to control the transcriptional and translational activity of a heterologous polynucleotide of interest. In addition, polynucleotides from regulatory regions of a GlyT1 gene can be used to identify GlyT1 or related genes elsewhere in the genome of the same species or in the genomes of heterologous species.

Thus, the present invention also concerns a purified or isolated nucleic acid comprising a polynucleotide which is selected from the group consisting of the 5′ and 3′ regulatory regions, a sequence complementary thereto, and biologically active fragments or variants thereof.

The invention also pertains to a purified or isolated nucleic acid comprising a polynucleotide having at least 95% nucleotide identity with a polynucleotide selected from the group consisting of the 5′ and 3′ regulatory regions, advantageously 99% nucleotide identity, preferably 99.5% nucleotide identity and most preferably 99.8% nucleotide identity with a polynucleotide selected from the group consisting of the 5′ and 3′ regulatory regions, a sequence complementary thereto, a variant thereof, and a biologically active fragment thereof.

Another object of the invention consists of purified, isolated or recombinant nucleic acids comprising a polynucleotide that hybridizes, under the stringent hybridization conditions defined herein, with a polynucleotide selected from the group consisting of the nucleotide sequences of the 5′ and 3′ regulatory regions, a sequence complementary thereto, a variant thereof, and a biologically active fragment thereof.

Preferred fragments of the 5′ regulatory region have a length of about 1500 or 1000 nucleotides, preferably of about 500 nucleotides, more preferably about 400 nucleotides, even more preferably 300 nucleotides and most preferably about 200 nucleotides.

Preferred fragments of the 3′ regulatory region are at least 50, 100, 150, 200, 300 or 400 bases in length.

“Biologically active” regulatory polynucleotide derivatives of SEQ IDNos: 14-21 are polynucleotides comprising or alternatively consisting ofa fragment of said polynucleotide which is functional as a regulatoryregion for expressing a recombinant polypeptide or a recombinantpolynucleotide in a recombinant cell host. It could act either as anenhancer or as a repressor of transcription or translation. For thepurpose of the invention, a nucleic acid or polynucleotide is“functional” as a regulatory region for expressing a recombinantpolypeptide or a recombinant polynucleotide if said regulatorypolynucleotide contains nucleotide sequences which containtranscriptional and translational regulatory information. Such sequencescan then be “operably linked” to nucleotide sequences which encode thedesired polypeptide or the desired polynucleotide.

The regulatory polynucleotides of the invention may be prepared from the nucleotide sequence of SEQ ID NO:1 or any of SEQ ID NOs:14-21 by cleavage using suitable restriction enzymes, as described for example in Sambrook et al. (1989). The regulatory polynucleotides may also be prepared by digestion of SEQ ID NO:1 or any of SEQ ID NOs:14-21 by an exonuclease enzyme, such as Bal31 (Wabiko et al., 1986). These regulatory polynucleotides can also be prepared by nucleic acid chemical synthesis, as described elsewhere in the specification.

The regulatory polynucleotides according to the invention may be part of a recombinant expression vector that may be used to express a coding sequence in a desired host cell or host organism. The recombinant expression vectors according to the invention are described elsewhere in the specification.

The desired nucleic acids encoded by the above-described polynucleotide, e.g. an RNA molecule, may be complementary to a desired coding polynucleotide, for example to the GlyT1 coding sequence, and thus useful as an antisense polynucleotide.

Such a polynucleotide may be included in a recombinant expression vector in order to express the desired polypeptide or the desired nucleic acid in a host cell or in a host organism. Suitable recombinant vectors that contain a polynucleotide such as described herein are disclosed elsewhere in the specification.

Polynucleotide Constructs

The terms “polynucleotide construct” and “recombinant polynucleotide” are used interchangeably herein to refer to linear or circular, purified or isolated polynucleotides that have been artificially designed and which comprise at least two nucleotide sequences that are not found as contiguous nucleotide sequences in their initial natural environment.

In order to study the physiological and phenotypic consequences of a lack of synthesis of the GlyT1 protein, both at the cell level and at the multi-cellular organism level, the invention also encompasses DNA constructs and recombinant vectors enabling a conditional expression of specific cDNAs encoded by the GlyT1 genomic sequence or variants, derivatives, or fragments thereof.

The present invention embodies recombinant vectors comprising any one of the polynucleotides described in the present invention. Preferably, the polynucleotide constructs according to the present invention comprise any of the polynucleotides described in the “GlyT1 cDNA Sequences” section, the “Coding Regions” section, and the “Oligonucleotide Probes And Primers” section.

One preferred DNA construct is based on the tetracycline resistance operon tet from E. coli transposon Tn10 for controlling GlyT1 gene expression, such as described by Gossen et al. (1992, 1995) and Furth et al. (1994). Such a DNA construct contains seven tet operator sequences from Tn10 (tetop) that are fused to a minimal promoter and/or a 5′-regulatory sequence of the GlyT1 gene, said minimal promoter or said GlyT1 regulatory sequence being operably linked to a polynucleotide of interest that codes either for a sense or an antisense oligonucleotide or for a polypeptide, including a GlyT1 polypeptide (preferably a novel GlyT1 polypeptide provided herein) or a peptide fragment thereof. This DNA construct is functional as a conditional expression system for the nucleotide sequence of interest when the same cell also comprises a nucleotide sequence coding for either the wild type (tTA) or the mutant (rTA) repressor fused to the activating domain of viral protein VP16 of herpes simplex virus, placed under the control of a promoter, such as the HCMVIE1 enhancer/promoter or the MMTV-LTR. Indeed, a preferred DNA construct of the invention comprises both the polynucleotide containing the tet operator sequences and the polynucleotide containing a sequence coding for the tTA or the rTA repressor.

The present DNA constructs may be used to introduce a desired nucleotide sequence of the invention, preferably a novel GlyT1 cDNA sequence, within a predetermined location of the targeted genome, leading either to the generation of an altered copy of a targeted gene (knock-out homologous recombination) or to the replacement of a copy of the targeted gene by another copy sufficiently homologous to allow a homologous recombination event to occur (knock-in homologous recombination).

Nuclear Antisense DNA Constructs

Other compositions contain a vector of the invention comprising an oligonucleotide fragment of any of the nucleic acid sequences shown as SEQ ID NOs:2-9 or 14-21, preferably a fragment including the start codon of any of the present novel GlyT1 cDNAs, as an antisense tool that inhibits the expression of the corresponding GlyT1 cDNA. Preferred methods using antisense polynucleotides according to the present invention are the procedures described by Sczakiel et al. (1995) or those described in PCT Application No. WO 95/24223, the disclosures of which are incorporated by reference herein in their entirety.

Preferably, the antisense tools are chosen among the polynucleotides (15-200 bp long) that are complementary to the 5′ end of the GlyT1 mRNA. In one embodiment, a combination of different antisense polynucleotides complementary to different parts of the desired targeted gene is used. Preferred antisense polynucleotides according to the present invention are complementary to a sequence of any of the present GlyT1 mRNAs that contains either the translation initiation codon ATG or a splicing site. Further preferred antisense polynucleotides according to the invention are complementary to a splicing site of any of the present GlyT1 mRNAs.
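By way of illustration only, the selection of an antisense polynucleotide complementary to a region containing the translation initiation codon may be sketched as follows. The function names, the default window length, the number of upstream bases and the toy sequence are assumptions made for this sketch; the toy sequence is not taken from any SEQ ID NO of the Sequence Listing.

```python
# Minimal sketch: derive an antisense oligonucleotide covering the ATG codon.
# All names and the example sequence are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_over_start_codon(sense_cdna: str, length: int = 15,
                               upstream: int = 6) -> str:
    """Return an antisense oligo complementary to a window containing the
    first ATG of the sense strand.

    The window starts `upstream` bases before the ATG (or at position 0)
    and spans `length` bases; 15 is the lower bound of the 15-200 range
    mentioned in the text.
    """
    sense = sense_cdna.upper()
    atg = sense.find("ATG")
    if atg < 0:
        raise ValueError("no ATG codon found in the sense sequence")
    start = max(0, atg - upstream)
    target = sense[start:start + length]
    if len(target) < length:
        raise ValueError("sense sequence too short for the requested window")
    return reverse_complement(target)


if __name__ == "__main__":
    toy_sense = "GGCACCATGGCCGCAGCTCAGGGA"  # hypothetical sequence
    print(antisense_over_start_codon(toy_sense))
```

The first ATG found is used here purely for simplicity; in practice the annotated initiation codon of the relevant cDNA would be supplied explicitly.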

Preferably, the antisense polynucleotides of the invention have a 3′ polyadenylation signal that has been replaced with a self-cleaving ribozyme sequence, such that RNA polymerase II transcripts are produced without poly(A) at their 3′ ends, these antisense polynucleotides being incapable of export from the nucleus, such as described by Liu et al. (1994). In a preferred embodiment, these GlyT1 antisense polynucleotides also comprise, within the ribozyme cassette, a histone stem-loop structure to stabilize cleaved transcripts against 3′-5′ exonucleolytic degradation, such as the structure described by Eckner et al. (1991).

Oligonucleotide Probes and Primers

Polynucleotides derived from the GlyT1 gene are useful in order to detect the expression of any of the novel cDNAs shown as SEQ ID NOs:14-21, or any cDNA comprising any of the novel exons shown as SEQ ID NOs:2-9, or fragments, complements, or variants thereof in a test sample.

Particularly preferred probes and primers of the invention include isolated, purified, or recombinant polynucleotides comprising, consisting of, or consisting essentially of, a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, 1000 or more nucleotides of SEQ ID NOs:2-9 or 14-21, or the complements thereof.
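As a non-limiting sketch, the "contiguous span" criterion above can be checked computationally. The function names and the toy reference sequence are hypothetical, and the complement is represented here as the reverse complement written 5′ to 3′, which is an assumption of this example.

```python
# Minimal sketch: test whether a candidate probe is a contiguous span of at
# least `min_length` nucleotides of a reference sequence or of its complement.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_contiguous_span(probe: str, reference: str, min_length: int = 12) -> bool:
    """True if `probe` is long enough and occurs, without gaps, in the
    reference strand or in its reverse complement."""
    probe = probe.upper()
    reference = reference.upper()
    if len(probe) < min_length:
        return False
    return probe in reference or probe in reverse_complement(reference)


if __name__ == "__main__":
    toy_reference = "ATGGCCGCAGCTCAGGGACCCGTGGCG"  # hypothetical sequence
    candidate = toy_reference[3:18]                # 15-nucleotide span
    print(is_contiguous_span(candidate, toy_reference))
```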

Thus, the invention also relates to nucleic acid probes characterized in that they hybridize specifically, under the stringent hybridization conditions defined above, with any of the novel cDNAs or exons described herein, e.g. as shown as SEQ ID NOs:2-9 or 14-21, or sequences complementary thereto.

In a preferred embodiment, said probes comprise, consist of, or consist essentially of a sequence selected from SEQ ID NOs:34, 35, 36, and 37, and the complementary sequences thereto.

In an additional embodiment, the invention encompasses polynucleotides for use in hybridization assays, sequencing assays, and enzyme-based mismatch detection assays for determining the expression of particular cDNA species encoded by the GlyT1 gene, e.g. the expression of any of the herein-provided novel cDNAs, or the expression of any cDNAs comprising any of the herein-provided novel exons.

The invention concerns the use of the polynucleotides according to the invention for detecting the expression of any of the herein-provided novel cDNAs, or the expression of any cDNAs comprising any of the herein-provided novel exons, preferably in hybridization assays, sequencing assays, microsequencing assays, enzyme-based mismatch detection assays, or by amplifying segments of nucleotides comprising any of the present novel exons, or spanning any novel exon-exon junctions found in any of the present novel cDNAs (i.e. novel junctions resulting from novel exon configurations; see, e.g., Table I).

A probe or a primer according to the invention is preferably between 8 and 1000 nucleotides in length, or is specified to be at least 12, 15, 18, 20, 25, 35, 40, 50, 60, 70, 80, 100, 250, 500 or 1000 nucleotides in length. More particularly, the length of these probes and primers typically ranges from 8, 10, 15, 20, or 30 to 100 nucleotides, preferably from 10 to 50, more preferably from 15 to 30 nucleotides. Shorter probes and primers tend to lack specificity for a target nucleic acid sequence and generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. Longer probes and primers are expensive to produce and can sometimes self-hybridize to form hairpin structures. The appropriate length for primers and probes under a particular set of assay conditions may be empirically determined by one of skill in the art.

The formation of stable hybrids depends on the melting temperature (Tm) of the DNA. The Tm depends on the length of the primer or probe, the ionic strength of the solution and the G+C content. The higher the G+C content of the primer or probe, the higher is the melting temperature because G:C pairs are held by three H bonds whereas A:T pairs have only two. The GC content in the probes of the invention usually ranges between 10 and 75%, preferably between 35 and 60%, and more preferably between 40 and 55%.
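The dependence of Tm on G+C content can be illustrated with a short sketch. The specification does not prescribe a Tm formula; the example below assumes the widely used Wallace rule of thumb for short oligonucleotides, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C), which ignores ionic strength. The function names and the example oligonucleotide are hypothetical.

```python
# Minimal sketch: G+C content and a rule-of-thumb Tm for a short oligo.
# The Wallace rule is an assumption of this example, not a method taken
# from the specification; it is only a rough estimate for short oligos.


def gc_content(seq: str) -> float:
    """Return the G+C content of a sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def wallace_tm(seq: str) -> int:
    """Estimate Tm in degrees Celsius as 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def in_gc_range(seq: str, low: float = 40.0, high: float = 55.0) -> bool:
    """True if the G+C content lies within [low, high] percent.

    The defaults mirror the 'more preferably between 40 and 55%' range.
    """
    return low <= gc_content(seq) <= high


if __name__ == "__main__":
    oligo = "ATGCATGCATGCATGC"  # hypothetical 16-mer
    print(gc_content(oligo), wallace_tm(oligo), in_gc_range(oligo))
```

Each G or C contributes twice as much to the estimate as each A or T, reflecting the three hydrogen bonds of a G:C pair against the two of an A:T pair.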

The primers and probes can be prepared by any suitable method, including, for example, cloning and restriction of appropriate sequences and direct chemical synthesis by a method such as the phosphotriester method of Narang et al. (1979), the phosphodiester method of Brown et al. (1979), the diethylphosphoramidite method of Beaucage et al. (1981) and the solid support method described in EP 0 707 592.

Detection probes are generally nucleic acid sequences or uncharged nucleic acid analogs such as, for example, peptide nucleic acids, which are disclosed in PCT Application WO 92/20702, and morpholino analogs, which are described in U.S. Pat. Nos. 5,185,444; 5,034,506 and 5,142,047. The probe may have to be rendered “non-extendable” in that additional dNTPs cannot be added to the probe. In and of themselves analogs usually are non-extendable, and nucleic acid probes can be rendered non-extendable by modifying the 3′ end of the probe such that the hydroxyl group is no longer capable of participating in elongation. For example, the 3′ end of the probe can be functionalized with the capture or detection label to thereby consume or otherwise block the hydroxyl group. Alternatively, the 3′ hydroxyl group simply can be cleaved, replaced or modified; U.S. Pat. No. 4,869,905 describes modifications which can be used to render a probe non-extendable.

Any of the polynucleotides of the present invention can be labeled, if desired, by incorporating any label known in the art to be detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include radioactive substances (including 32P, 35S, 3H, 125I), fluorescent dyes (including 5-bromodeoxyuridine, fluorescein, acetylaminofluorene, digoxigenin) or biotin. Preferably, polynucleotides are labeled at their 3′ and 5′ ends. Examples of non-radioactive labeling of nucleic acid fragments are described in French patent No. FR-7810975 or by Urdea et al. (1988) or Sanchez-Pescador et al. (1988). In addition, the probes according to the present invention may have structural characteristics such that they allow signal amplification, such structural characteristics being, for example, branched DNA probes such as those described by Urdea et al. (1991) or in European patent No. EP 0 225 807 (Chiron).

A label can also be used to capture the primer, so as to facilitate the immobilization of either the primer or a primer extension product, such as amplified DNA, on a solid support. A capture label is attached to the primers or probes and can be a specific binding member which forms a binding pair with the solid phase reagent's specific binding member (e.g. biotin and streptavidin). Therefore, depending upon the type of label carried by a polynucleotide or a probe, it may be employed to capture or to detect the target DNA. Further, it will be understood that the polynucleotides, primers or probes provided herein may, themselves, serve as the capture label. For example, in the case where a solid phase reagent's binding member is a nucleic acid sequence, it may be selected such that it binds a complementary portion of a primer or probe to thereby immobilize the primer or probe to the solid phase. In cases where a polynucleotide probe itself serves as the binding member, those skilled in the art will recognize that the probe will contain a sequence or “tail” that is not complementary to the target. In the case where a polynucleotide primer itself serves as the capture label, at least a portion of the primer will be free to hybridize with a nucleic acid on a solid phase. DNA labeling techniques are well known to the skilled technician.

The probes of the present invention are useful for a number of purposes. They can notably be used in Southern hybridization to genomic DNA. The probes can also be used to detect PCR amplification products. They may also be used to detect mismatches in the GlyT1 gene or mRNA using other techniques.

Any of the polynucleotides, primers and probes of the present invention can be conveniently immobilized on a solid support. Solid supports are known to those skilled in the art and include the walls of wells of a reaction tray, test tubes, polystyrene beads, magnetic beads, nitrocellulose strips, membranes, microparticles such as latex particles, sheep (or other animal) red blood cells, duracytes and others. The solid support is not critical and can be selected by one skilled in the art. Thus, latex particles, microparticles, magnetic or non-magnetic beads, membranes, plastic tubes, walls of microtiter wells, glass or silicon chips, sheep (or other suitable animal's) red blood cells and duracytes are all suitable examples. Suitable methods for immobilizing nucleic acids on solid phases include ionic, hydrophobic, covalent interactions and the like. A solid support, as used herein, refers to any material which is insoluble, or can be made insoluble by a subsequent reaction. The solid support can be chosen for its intrinsic ability to attract and immobilize the capture reagent. Alternatively, the solid phase can retain an additional receptor which has the ability to attract and immobilize the capture reagent. The additional receptor can include a charged substance that is oppositely charged with respect to the capture reagent itself or to a charged substance conjugated to the capture reagent. As yet another alternative, the receptor molecule can be any specific binding member which is immobilized upon (attached to) the solid support and which has the ability to immobilize the capture reagent through a specific binding reaction. The receptor molecule enables the indirect binding of the capture reagent to a solid support material before the performance of the assay or during the performance of the assay. The solid phase thus can be a plastic, derivatized plastic, magnetic or non-magnetic metal, glass or silicon surface of a test tube, microtiter well, sheet, bead, microparticle, chip, sheep (or other suitable animal's) red blood cells, Duracytes® and other configurations known to those of ordinary skill in the art. The polynucleotides of the invention can be attached to or immobilized on a solid support individually or in groups of at least 2, 5, 8, 10, 12, 15, 20, or 25 distinct polynucleotides of the invention to a single solid support. In addition, polynucleotides other than those of the invention may be attached to the same solid support as one or more polynucleotides of the invention.

Oligonucleotide Arrays

A substrate comprising a plurality of oligonucleotide primers or probes of the invention may be used, e.g., to detect expression of a plurality of any of the herein-provided cDNAs, or to detect the expression of one or more of the present cDNAs in conjunction with the expression of one or more heterologous genes.

Any polynucleotide provided herein may be attached in overlapping areas or at random locations on the solid support. Alternatively, the polynucleotides of the invention may be attached in an ordered array wherein each polynucleotide is attached to a distinct region of the solid support which does not overlap with the attachment site of any other polynucleotide. Preferably, such an ordered array of polynucleotides is designed to be “addressable” where the distinct locations are recorded and can be accessed as part of an assay procedure. Addressable polynucleotide arrays typically comprise a plurality of different oligonucleotide probes that are coupled to a surface of a substrate in different known locations. The knowledge of the precise location of each polynucleotide makes these “addressable” arrays particularly useful in hybridization assays. Any addressable array technology known in the art can be employed with the polynucleotides of the invention. One particular embodiment of these polynucleotide arrays is known as GENECHIPS, and has been generally described in U.S. Pat. No. 5,143,854 and PCT publications WO 90/15070 and WO 92/10092. These arrays may generally be produced using mechanical synthesis methods or light-directed synthesis methods which incorporate a combination of photolithographic methods and solid phase oligonucleotide synthesis (Fodor et al., 1991). The immobilization of arrays of oligonucleotides on solid supports has been rendered possible by the development of a technology generally identified as “Very Large Scale Immobilized Polymer Synthesis” (VLSIPS) in which, typically, probes are immobilized in a high density array on a solid surface of a chip. Examples of VLSIPS technologies are provided in U.S. Pat. Nos. 5,143,854 and 5,412,087 and in PCT Publications WO 90/15070, WO 92/10092 and WO 95/11995, which describe methods for forming oligonucleotide arrays through techniques such as light-directed synthesis techniques. In designing strategies aimed at providing arrays of nucleotides immobilized on solid supports, further presentation strategies were developed to order and display the oligonucleotide arrays on the chips in an attempt to maximize hybridization patterns and sequence information. Examples of such presentation strategies are disclosed in PCT Publications WO 94/12305, WO 94/11530, WO 97/29212 and WO 97/31256, the disclosures of which are incorporated herein by reference in their entireties.
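The notion of an "addressable" ordered array, in which each probe occupies a distinct, recorded and non-overlapping location, may be sketched as a simple data structure. The class and method names below are hypothetical and are offered only to illustrate the bookkeeping; they do not describe any commercial array platform.

```python
# Minimal sketch: an addressable array that records which probe sits at
# which (row, column) location and refuses overlapping placements.
from typing import Dict, List, Optional, Tuple

Address = Tuple[int, int]


class AddressableArray:
    def __init__(self, rows: int, cols: int) -> None:
        if rows <= 0 or cols <= 0:
            raise ValueError("array dimensions must be positive")
        self.rows = rows
        self.cols = cols
        self._probes: Dict[Address, str] = {}

    def place(self, address: Address, probe: str) -> None:
        """Attach a probe at a distinct location; overlaps are rejected."""
        row, col = address
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise IndexError("address lies outside the array")
        if address in self._probes:
            raise ValueError("location already occupied by another probe")
        self._probes[address] = probe.upper()

    def probe_at(self, address: Address) -> Optional[str]:
        """Look up the probe recorded at a location, or None if empty."""
        return self._probes.get(address)

    def addresses_of(self, probe: str) -> List[Address]:
        """Return every location at which the given probe was placed."""
        probe = probe.upper()
        return sorted(a for a, p in self._probes.items() if p == probe)


if __name__ == "__main__":
    array = AddressableArray(rows=2, cols=2)
    array.place((0, 0), "ATGGCCGCAGCTCAG")  # hypothetical probe sequence
    print(array.probe_at((0, 0)))
```

Because every hybridization signal can be traced back to a known address, the identity of the probe producing that signal is recovered by a single lookup.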

Consequently, the invention concerns an array of nucleic acid molecules comprising at least one polynucleotide described above as probes and primers. Preferably, the invention concerns an array of nucleic acid molecules comprising at least two polynucleotides described above as probes and primers.

GlyT1 Proteins and Polypeptide Fragments

The term “GlyT1 polypeptides” is used herein to embrace all of the proteins and polypeptides of the present invention. Also forming part of the invention are polypeptides encoded by the polynucleotides of the invention, as well as fusion polypeptides comprising such polypeptides. The invention embodies GlyT1 proteins from humans, including isolated or purified GlyT1 proteins consisting of, consisting essentially of, or comprising any of the sequences of SEQ ID NOs:26-33.

The invention concerns polypeptides encoded by a nucleotide sequence selected from the group consisting of SEQ ID NOs:2-9 or 14-21, a complementary sequence thereof or a fragment thereof.

The present invention embodies isolated, purified, and recombinant polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of SEQ ID NOs:26-33. The present invention also embodies isolated, purified, and recombinant polypeptides comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids encoded by any of the exons shown as SEQ ID NOs:2-9.

The invention also encompasses purified, isolated, or recombinant polypeptides comprising an amino acid sequence having at least 70, 75, 80, 85, 90, 95, 98 or 99% amino acid identity with any of the amino acid sequences of SEQ ID NOs:26-33, or any of the amino acid sequences encoded by any of the nucleic acid sequences shown as SEQ ID NOs:2-9 or 14-21, or a fragment thereof.
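For illustration, a percent amino acid identity figure of the kind recited above may be computed from two sequences that have already been aligned. The sketch assumes a pre-computed pairwise alignment in which gaps are written as "-", and it assumes that identity is expressed relative to the full alignment length; other conventions for the denominator exist. The function name and the toy peptides are hypothetical.

```python
# Minimal sketch: percent identity over a pre-aligned pair of sequences.
# The alignment itself (e.g. by a dynamic-programming aligner) is assumed
# to have been produced elsewhere.


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return identical, non-gap columns as a percentage of all columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical 8-residue peptides differing at the last position.
    identity = percent_identity("MKTAYIAK", "MKTAYIAR")
    print(identity, identity >= 85.0)
```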

GlyT1 proteins are preferably isolated from human or mammalian tissue samples or expressed from human or mammalian genes. The GlyT1 polypeptides of the invention can be made using routine expression methods known in the art. For example, a polynucleotide encoding the desired polypeptide is ligated into an expression vector suitable for any convenient host. Either eukaryotic or prokaryotic host systems can be used to produce recombinant polypeptides. The polypeptide is then isolated from lysed cells or from the culture medium and purified to the extent needed for its intended use. Purification can be carried out using any technique known in the art, for example, differential extraction, salt fractionation, chromatography, centrifugation, and the like. See, for example, Methods in Enzymology for a variety of methods for purifying proteins.

In addition, shorter protein fragments can be produced by chemical synthesis. Alternatively, the proteins of the invention can be extracted from cells or tissues of humans or non-human animals. Methods for purifying proteins are known in the art, and include the use of detergents or chaotropic agents to disrupt particles followed by differential extraction and separation of the polypeptides by ion exchange chromatography, affinity chromatography, sedimentation according to density, or gel electrophoresis.

Any GlyT1 polynucleotide, preferably a novel cDNA shown as SEQ ID NOs:14-21, can be used to express GlyT1 proteins and polypeptides. The nucleic acid encoding the GlyT1 protein or polypeptide to be expressed can be operably linked to a promoter in an expression vector using conventional cloning technology. The GlyT1 insert in the expression vector may comprise the full coding sequence for the GlyT1 protein or a portion thereof. For example, the GlyT1 derived insert may encode a polypeptide comprising at least 10 consecutive amino acids of the GlyT1 protein of SEQ ID NOs:26-33, or a protein encoded by any of the nucleic acids shown as SEQ ID NOs:2-9 or 14-21.

The expression vector is any of the mammalian, yeast, insect or bacterial expression systems known in the art. Commercially available vectors and expression systems are available from a variety of suppliers including Genetics Institute (Cambridge, Mass.), Stratagene (La Jolla, Calif.), Promega (Madison, Wis.), and Invitrogen (San Diego, Calif.). If desired, to enhance expression and facilitate proper protein folding, the codon context and codon pairing of the sequence are optimized for the particular expression organism in which the expression vector is introduced, as explained by Hatfield et al., U.S. Pat. No. 5,082,767, the disclosure of which is incorporated by reference herein in its entirety.

In one embodiment, the entire coding sequence of the cDNA through the poly A signal of the cDNA is operably linked to a promoter in the expression vector. Alternatively, if the nucleic acid encoding a portion of the GlyT1 protein lacks a methionine to serve as the initiation site, an initiating methionine can be introduced next to the first codon of the nucleic acid using conventional techniques. Similarly, if the insert from the GlyT1 cDNA lacks a poly A signal, this sequence can be added to the construct by, for example, splicing out the poly A signal from pSG5 (Stratagene) using BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1 (Stratagene). pXT1 contains the LTRs and a portion of the gag gene from Moloney Murine Leukemia Virus. The position of the LTRs in the construct allows efficient stable transfection. The vector includes the Herpes Simplex Thymidine Kinase promoter and the selectable neomycin gene.

The finished constructs may be transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island, N.Y.) under conditions outlined in the product specification. Positive transfectants are selected after growing the transfected cells in 600 μg/ml G418 (Sigma, St. Louis, Mo.).

The expressed protein may be purified using conventional purification techniques such as ammonium sulfate precipitation or chromatographic separation based on size or charge. The protein encoded by the nucleic acid insert may also be purified using standard immunochromatography techniques. In such procedures, a solution containing the expressed GlyT1 protein or portion thereof, such as a cell extract, is applied to a column having antibodies against the GlyT1 protein or portion thereof attached to the chromatography matrix. The expressed protein is allowed to bind the immunochromatography column. Thereafter, the column is washed to remove non-specifically bound proteins. The specifically bound expressed protein is then released from the column and recovered using standard techniques.

To confirm expression of the GlyT1 protein or a portion thereof, the proteins expressed from host cells containing an expression vector containing an insert encoding the GlyT1 protein or a portion thereof can be compared to the proteins expressed in host cells containing the expression vector without an insert. The presence of a band in samples from cells containing the expression vector with an insert which is absent in samples from cells containing the expression vector without an insert indicates that the GlyT1 protein or a portion thereof is being expressed. Generally, the band will have the mobility expected for the GlyT1 protein or portion thereof. However, the band may have a mobility different than that expected as a result of modifications such as glycosylation, ubiquitination, or enzymatic cleavage.

Antibodies capable of specifically recognizing the expressed GlyT1 protein or a portion thereof can be prepared using standard methods and are described below.

If antibody production is not possible, the nucleic acids encoding the GlyT1 protein or a portion thereof may be incorporated into an expression vector designed for use in purification schemes employing chimeric polypeptides. In such strategies the nucleic acid encoding the GlyT1 protein or a portion thereof is inserted in frame with the gene encoding the other half of the chimera. The other half of the chimera is, e.g., a beta-globin or a nickel binding polypeptide encoding sequence. A chromatography matrix having an antibody to beta-globin or nickel attached thereto is then used to purify the chimeric protein. Protease cleavage sites are engineered between the beta-globin gene or the nickel binding polypeptide and the GlyT1 protein or portion thereof. Thus, the two polypeptides of the chimera are separated from one another by protease digestion.

One useful expression vector for generating beta-globin chimeric proteins is pSG5 (Stratagene), which encodes rabbit beta-globin. Intron II of the rabbit beta-globin gene facilitates splicing of the expressed transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods texts such as Davis et al. (1986) and many of the methods are available from Stratagene, Life Technologies, Inc., or Promega. Polypeptide may additionally be produced from the construct using in vitro translation systems such as the IN VITRO EXPRESS Translation Kit (Stratagene).

Antibodies that Bind GlyT1 Polypeptides of the Invention

Any GlyT1 polypeptide or whole protein may be used to generate antibodies capable of specifically binding to an expressed GlyT1 protein or fragment thereof as described.

In preferred embodiments, antibodies are prepared that specifically recognize any of the novel GlyT1 polypeptides of the invention (e.g. polypeptides comprising a sequence shown as SEQ ID NOs:26-33), or a polypeptide comprising a sequence encoded by any of the novel exons of the invention (SEQ ID NOs:2-9). For an antibody composition to specifically bind to a first variant of GlyT1, it must demonstrate at least a 5%, 10%, 15%, 20%, 25%, 50%, or 100% greater binding affinity for a full length first variant of the GlyT1 protein than for a full length second variant of the GlyT1 protein in an ELISA, RIA, or other antibody-based binding assay.
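The specific-binding criterion above reduces to a simple relative comparison, sketched below. The sketch assumes that the two inputs are affinity-proportional readouts from the same assay in which a larger value means tighter binding (for example, association constants rather than dissociation constants); the function names and numerical values are hypothetical.

```python
# Minimal sketch: does an antibody show at least N% greater binding affinity
# for a first GlyT1 variant than for a second variant?


def percent_greater(first: float, second: float) -> float:
    """Return how much larger `first` is than `second`, in percent."""
    if second <= 0:
        raise ValueError("reference affinity must be positive")
    return 100.0 * (first - second) / second


def is_specific(first: float, second: float,
                min_percent_greater: float = 5.0) -> bool:
    """True if binding to the first variant exceeds binding to the second
    by at least the chosen threshold (5% is the lowest value recited)."""
    return percent_greater(first, second) >= min_percent_greater


if __name__ == "__main__":
    # Hypothetical readouts in arbitrary units.
    print(percent_greater(150.0, 100.0), is_specific(150.0, 100.0, 50.0))
```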

In a preferred embodiment, the invention concerns antibody compositions, either polyclonal or monoclonal, capable of selectively binding to an epitope-containing polypeptide comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of any of SEQ ID NOs:26-33, or encoded by SEQ ID NOs:2-9.

In a preferred embodiment, the invention concerns the use in the manufacture of antibodies of a polypeptide comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of any of SEQ ID NOs:26-33, or encoded by SEQ ID NOs:2-9.

Non-human animals or mammals, whether wild-type or transgenic, which express a different species of GlyT1 than the one to which antibody binding is desired, and animals which do not express GlyT1 (i.e. a GlyT1 knock out animal as described herein) are particularly useful for preparing antibodies. GlyT1 knock out animals will recognize all or most of the exposed regions of a GlyT1 protein as foreign antigens, and therefore produce antibodies recognizing a wider array of GlyT1 epitopes. Moreover, smaller polypeptides with only 10 to 30 amino acids may be useful in obtaining specific binding to any one of the GlyT1 proteins. In addition, the humoral immune system of animals which produce a species of GlyT1 that resembles the antigenic sequence will preferentially recognize the differences between the animal's native GlyT1 species and the antigen sequence, and produce antibodies to these unique sites in the antigen sequence. Such a technique will be particularly useful in obtaining antibodies that specifically bind to any one of the GlyT1 proteins.

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample. The antibodies may also be used in therapeutic compositions for killing cells expressing the protein or reducing the levels of the protein in the body.

The antibodies of the invention may be labeled using any of a large number of labels, including any one of the radioactive, fluorescent or enzymatic labels known in the art.

Consequently, the invention is also directed to a method for detecting specifically the presence of a GlyT1 polypeptide according to the invention in a biological sample, said method comprising bringing into contact the biological sample with a polyclonal or monoclonal antibody that specifically binds a GlyT1 polypeptide comprising an amino acid sequence of SEQ ID NOs:26-33, or encoded by any of the nucleic acid sequences shown as SEQ ID NOs:2-9 or 14-21; or to a peptide fragment or variant thereof, and detecting the antigen-antibody complex formed.

The invention also concerns a diagnostic kit for detecting in vitro the presence of a GlyT1 polypeptide according to the present invention in a biological sample, wherein said kit comprises a polyclonal or monoclonal antibody that specifically binds a GlyT1 polypeptide comprising an amino acid sequence of SEQ ID NOs:26-33, or encoded by any of the nucleic acid sequences shown as SEQ ID NOs:2-9 or 14-21; or to a peptide fragment or variant thereof, optionally labeled; and a reagent allowing the detection of the antigen-antibody complexes formed, said reagent carrying optionally a label, or being able to be recognized itself by a labeled reagent, more particularly in the case when the above-mentioned monoclonal or polyclonal antibody is not labeled by itself.

Recombinant Vectors

The term “vector” is used herein to designate either a circular or a linear DNA or RNA molecule, which is either double-stranded or single-stranded, and which comprises at least one polynucleotide of interest that is sought to be transferred in a cell host or in a unicellular or multicellular host organism.

The present invention encompasses a family of recombinant vectors that comprise any regulatory or coding polynucleotide derived from any of the herein-provided novel GlyT1 cDNAs.

In a first preferred embodiment, a recombinant vector of the invention is used to amplify an inserted polynucleotide derived from a GlyT1 cDNA in a suitable cell host, this polynucleotide being amplified each time that the recombinant vector replicates.

A second preferred embodiment of the recombinant vectors according to the invention comprises expression vectors comprising a regulatory polynucleotide and/or a coding nucleic acid of the invention. Within certain embodiments, expression vectors are employed to express the GlyT1 polypeptide which can then be purified and, for example, be used in ligand screening assays or as an immunogen in order to raise specific antibodies directed against the GlyT1 protein. In other embodiments, the expression vectors are used for constructing transgenic animals and also for gene therapy. Expression requires that appropriate signals are provided in the vectors, said signals including various regulatory elements, such as enhancers/promoters from both viral and mammalian sources that drive expression of the genes of interest in host cells. Dominant drug selection markers for establishing permanent, stable cell clones expressing the products are generally included in the expression vectors of the invention, as are elements that link expression of the drug selection markers to expression of the polypeptide.

More particularly, the present invention relates to expression vectors which include nucleic acids encoding a GlyT1 protein, preferably a GlyT1 protein of any of the amino acid sequences of SEQ ID NOs:26-33, or variants or fragments thereof.

The invention also pertains to a recombinant expression vector useful for the expression of a GlyT1 coding sequence, wherein said vector comprises a nucleic acid of SEQ ID NOs:2-9 or 14-21.

Some of the elements which can be found in the vectors of the present invention are described in further detail elsewhere in the present specification.

The present invention also encompasses primary, secondary, and immortalized homologously recombinant host cells of vertebrate origin, preferably mammalian origin and particularly human origin, that have been engineered to: a) insert exogenous (heterologous) polynucleotides into the endogenous chromosomal DNA of a targeted gene, b) delete endogenous chromosomal DNA, and/or c) replace endogenous chromosomal DNA with exogenous polynucleotides. Insertions, deletions, and/or replacements of polynucleotide sequences may be to the coding sequences of the targeted gene and/or to regulatory regions, such as promoter and enhancer sequences, operably associated with the targeted gene.

The present invention further relates to a method of altering the expression of a targeted gene in a cell in vitro or in vivo wherein the gene is not normally expressed in the cell, comprising the steps of: (a) transfecting the cell in vitro or in vivo with a polynucleotide construct, the polynucleotide construct comprising: (i) a targeting sequence; (ii) a regulatory sequence and/or a coding sequence; and (iii) an unpaired splice donor site, if necessary, thereby producing a transfected cell; and (b) maintaining the transfected cell in vitro or in vivo under conditions appropriate for homologous recombination, thereby producing a homologously recombinant cell; and (c) maintaining the homologously recombinant cell in vitro or in vivo under conditions appropriate for expression of the gene. Methods of making cells with altered expression, and polynucleotide constructs used to make the cells, are also provided.

Another method for altering the expression of a targeted gene, e.g. a GlyT1 gene, is by introducing into a cell capable of expressing GlyT1 a polynucleotide whose presence in the cell alters the expression of the GlyT1 gene. For example, the polynucleotide may act to replace the endogenous GlyT1 promoter with a more or less active promoter, or may comprise an enhancer element whose insertion into the genome in the vicinity of the GlyT1 gene results in an increase or decrease in the expression of the GlyT1.

The compositions may be produced, and methods performed, by techniques known in the art, such as those described in U.S. Pat. Nos. 6,054,288; 6,048,729; 6,048,724; 6,048,524; 5,994,127; 5,968,502; 5,965,125; 5,869,239; 5,817,789; 5,783,385; 5,733,761; 5,641,670; 5,580,734; International Publication Nos: WO 96/29411, WO 94/12650; and scientific articles including Koller et al., (1989) Proc. Natl. Acad. Sci. USA 86:8932-8935.

1. General Features of the Expression Vectors of the Invention

Recombinant vectors that can be used in the present invention include, but are not limited to, YACs (Yeast Artificial Chromosome), BACs (Bacterial Artificial Chromosome), phages, phagemids, cosmids, plasmids, and linear DNA molecules which may comprise chromosomal, non-chromosomal, semi-synthetic or synthetic DNA. Such recombinant vectors can comprise a transcriptional unit comprising an assembly of:

(1) a genetic element or elements having a regulatory role in gene expression, for example promoters or enhancers. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp in length, that act on the promoter to increase transcription;

(2) a structural or coding sequence which is transcribed into mRNA and eventually translated into a polypeptide, said structural or coding sequence being operably linked to the regulatory elements described in (1); and

(3) appropriate transcription initiation and termination sequences. Structural units intended for use in yeast or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated protein by a host cell. Alternatively, when a recombinant protein is expressed without a leader or transport sequence, it may include an N-terminal residue. This residue may or may not be subsequently cleaved from the expressed recombinant protein to provide a final product.

Generally, recombinant expression vectors will include origins of replication, selectable markers permitting transformation of the host cell, and a promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably a leader sequence capable of directing secretion of the translated protein into the periplasmic space or the extracellular medium. In a specific embodiment wherein the vector is adapted for transfecting and expressing desired sequences in mammalian host cells, preferred vectors will comprise an origin of replication in the desired host, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation signal, splice donor and acceptor sites, transcriptional termination sequences, and 5′-flanking non-transcribed sequences. DNA sequences derived from the SV40 viral genome, for example SV40 origin, early promoter, enhancer, splice and polyadenylation signals may be used to provide the required non-transcribed genetic elements.

The in vivo expression of a GlyT1 polypeptide of SEQ ID NOs:26-33, or fragments or variants thereof, may be useful in order to correct a genetic defect related to the expression of the native gene in a host organism or to the production of a biologically inactive GlyT1 protein.

Consequently, the present invention also comprises recombinant expression vectors mainly designed for the in vivo production of a GlyT1 polypeptide of SEQ ID NOs:26-33, or fragments or variants thereof, by the introduction of the appropriate genetic material in the organism of the patient to be treated. This genetic material may be introduced in vitro in a cell that has been previously extracted from the organism, the modified cell being subsequently reintroduced in the said organism, directly in vivo into the appropriate tissue.

2. Regulatory Elements

Promoters

The suitable promoter regions used in the expression vectors according to the present invention are chosen taking into account the cell host in which the heterologous gene has to be expressed. The particular promoter employed to control the expression of a nucleic acid sequence of interest is not believed to be important, so long as it is capable of directing the expression of the nucleic acid in the targeted cell. Thus, where a human cell is targeted, it is preferable to position the nucleic acid coding region adjacent to and under the control of a promoter that is capable of being expressed in a human cell, such as, for example, a human or a viral promoter.

A suitable promoter may be heterologous with respect to the nucleic acid for which it controls the expression or alternatively can be endogenous to the native polynucleotide containing the coding sequence to be expressed. Additionally, the promoter is generally heterologous with respect to the recombinant vector sequences within which the construct promoter/coding sequence has been inserted.

Promoter regions can be selected from any desired gene using, for example, CAT (chloramphenicol acetyltransferase) vectors and more preferably pKK232-8 and pCM7 vectors.

Preferred bacterial promoters are the LacI, LacZ, the T3 or T7 bacteriophage RNA polymerase promoters, the gpt, lambda PR, PL and trp promoters (EP 0036776), the polyhedrin promoter, or the p10 protein promoter from baculovirus (Kit Novagen) (Smith et al., 1983; O'Reilly et al., 1992), or also the trc promoter.

Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of a convenient vector and promoter is well within the level of ordinary skill in the art.

The choice of a promoter is well within the ability of a person skilled in the field of genetic engineering. For example, one may refer to Sambrook et al. (1989) or also to the procedures described by Fuller et al. (1996).

Other Regulatory Elements

Where a cDNA insert is employed, one will typically desire to include a polyadenylation signal to effect proper polyadenylation of the gene transcript. The nature of the polyadenylation signal is not believed to be crucial to the successful practice of the invention, and any such sequence may be employed such as human growth hormone and SV40 polyadenylation signals. Also contemplated as an element of the expression cassette is a terminator. These elements can serve to enhance message levels and to minimize read-through from the cassette into other sequences.

3. Selectable Markers

Such markers would confer an identifiable change to the cell permitting easy identification of cells containing the expression construct. The selectable marker genes for selection of transformed host cells are preferably dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, TRP1 for S. cerevisiae or tetracycline, rifampicin or ampicillin resistance in E. coli, or levan saccharase for mycobacteria, this latter marker being a negative selection marker.

4. Preferred Vectors

Bacterial Vectors

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable marker and a bacterial origin of replication derived from commercially available plasmids comprising genetic elements of pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia, Uppsala, Sweden), and GEM1 (Promega Biotec, Madison, Wis., USA).

Large numbers of other suitable vectors are known to those of skill in the art, and commercially available, such as the following bacterial vectors: pQE70, pQE60, pQE-9 (Qiagen), pbs, pD10, phagescript, psiX174, pbluescript SK, pbsks, pNH8A, pNH16A, pNH18A, pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); pWLNEO, pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia); pQE-30 (QIAexpress).

Bacteriophage Vectors

The P1 bacteriophage vector may contain large inserts ranging from about 80 to about 100 kb.

The construction of P1 bacteriophage vectors such as p158 or p158/neo8 is notably described by Sternberg (1992, 1994). Recombinant P1 clones comprising GlyT1 nucleotide sequences may be designed for inserting large polynucleotides of more than 40 kb (Linton et al., 1993). To generate P1 DNA for transgenic experiments, a preferred protocol is the protocol described by McCormick et al. (1994). Briefly, E. coli (preferably strain NS3529) harboring the P1 plasmid are grown overnight in a suitable broth medium containing 25 μg/ml of kanamycin. The P1 DNA is prepared from the E. coli by alkaline lysis using the Qiagen Plasmid Maxi kit (Qiagen, Chatsworth, Calif., USA), according to the manufacturer's instructions. The P1 DNA is purified from the bacterial lysate on two Qiagen-tip 500 columns, using the washing and elution buffers contained in the kit. A phenol/chloroform extraction is then performed before precipitating the DNA with 70% ethanol. After solubilizing the DNA in TE (10 mM Tris-HCl, pH 7.4, 1 mM EDTA), the concentration of the DNA is assessed by spectrophotometry.
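As a hedged illustration of the spectrophotometric step, the following Python sketch estimates a double-stranded DNA concentration from an absorbance reading at 260 nm. It relies on the common approximation that an A260 of 1.0 over a 1 cm path corresponds to about 50 μg/ml of double-stranded DNA. The function names, the dilution factor and the example readings are illustrative assumptions and are not taken from McCormick et al. (1994) or from the kit instructions.

```python
# Common approximation: A260 of 1.0 (1 cm path) ~ 50 ug/ml of dsDNA.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate the dsDNA concentration of the undiluted sample in ug/ml.

    a260            -- absorbance of the (possibly diluted) sample at 260 nm
    dilution_factor -- e.g. 20 for a 1:20 dilution (hypothetical example)
    path_cm         -- optical path length of the cuvette in cm
    """
    if a260 < 0 or dilution_factor <= 0 or path_cm <= 0:
        raise ValueError("absorbance must be >= 0; dilution and path must be > 0")
    return (a260 / path_cm) * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def a260_a280_ratio(a260, a280):
    """Return the A260/A280 ratio, often used as a rough purity indicator."""
    if a280 <= 0:
        raise ValueError("A280 must be > 0")
    return a260 / a280


if __name__ == "__main__":
    # Hypothetical readings for a TE-dissolved sample diluted 1:20.
    conc = dsdna_concentration_ug_per_ml(0.25, dilution_factor=20)
    print(f"estimated concentration: {conc:.1f} ug/ml")
    print(f"A260/A280: {a260_a280_ratio(0.36, 0.20):.2f}")
```

The conversion factor applies to double-stranded DNA only; a different factor would be needed for RNA or single-stranded DNA.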

When the goal is to express a P1 clone comprising GlyT1 nucleotide sequences in a transgenic animal, typically in transgenic mice, it is desirable to remove vector sequences from the P1 DNA fragment, for example by cleaving the P1 DNA at rare-cutting sites within the P1 polylinker (SfiI, NotI or SalI). The P1 insert is then purified from vector sequences on a pulsed-field agarose gel, using methods similar to those originally reported for the isolation of DNA from YACs (Schedl et al., 1993a; Peterson et al., 1993). At this stage, the resulting purified insert DNA can be concentrated, if necessary, on a Millipore Ultrafree-MC Filter Unit (Millipore, Bedford, Mass., USA; 30,000 molecular weight limit) and then dialyzed against microinjection buffer (10 mM Tris-HCl, pH 7.4; 250 μM EDTA) containing 100 mM NaCl, 30 μM spermine, 70 μM spermidine on a microdialysis membrane (type VS, 0.025 μm from Millipore). The intactness of the purified P1 DNA insert is assessed by electrophoresis on a 1% agarose (SeaKem GTG; FMC Bioproducts) pulsed-field gel and staining with ethidium bromide.

Baculovirus Vectors

A suitable vector for the expression of a GlyT1 polypeptide of SEQ ID NOs:26-33 or fragments or variants thereof is a baculovirus vector that can be propagated in insect cells and in insect cell lines. A specific suitable host vector system is the pVL1392/1393 baculovirus transfer vector (Pharmingen) that is used to transfect the SF9 cell line (ATCC No. CRL 1711) which is derived from Spodoptera frugiperda.

Other suitable vectors for the expression of the GlyT1 polypeptide of SEQ ID NOs:26-33 or fragments or variants thereof in a baculovirus expression system include those described by Chai et al. (1993), Vlasak et al. (1983) and Lenhard et al. (1996).

Viral Vectors

In one specific embodiment, the vector is derived from an adenovirus. Preferred adenovirus vectors according to the invention are those described by Feldman and Steg (1996) or Ohno et al. (1994). Another preferred recombinant adenovirus according to this specific embodiment of the present invention is the human adenovirus type 2 or 5 (Ad 2 or Ad 5) or an adenovirus of animal origin (French patent application No. FR-93.05954).

Retrovirus vectors and adeno-associated virus vectors are generally understood to be the recombinant gene delivery systems of choice for the transfer of exogenous polynucleotides in vivo, particularly to mammals, including humans. These vectors provide efficient delivery of genes into cells, and the transferred nucleic acids are stably integrated into the chromosomal DNA of the host.

Particularly preferred retroviruses for the preparation or construction of retroviral in vitro or in vivo gene delivery vehicles of the present invention include retroviruses selected from the group consisting of Mink-Cell Focus Inducing Virus, Murine Sarcoma Virus, Reticuloendotheliosis virus and Rous Sarcoma virus. Particularly preferred Murine Leukemia Viruses include the 4070A and the 1504A viruses, Abelson (ATCC No. VR-999), Friend (ATCC No. VR-245), Gross (ATCC No. VR-590), Rauscher (ATCC No. VR-998) and Moloney Murine Leukemia Virus (ATCC No. VR-190; PCT Application No. WO 94/24298). Particularly preferred Rous Sarcoma Viruses include Bryan high titer (ATCC Nos. VR-334, VR-657, VR-726, VR-659 and VR-728). Other preferred retroviral vectors are those described in Roth et al. (1996), PCT Application No. WO 93/25234, PCT Application No. WO 94/06920, Roux et al. (1989), Julan et al. (1992) and Neda et al. (1991).

Yet another viral vector system that is contemplated by the invention comprises the adeno-associated virus (AAV). The adeno-associated virus is a naturally occurring defective virus that requires another virus, such as an adenovirus or a herpes virus, as a helper virus for efficient replication and a productive life cycle (Muzyczka et al., 1992). It is also one of the few viruses that may integrate its DNA into non-dividing cells, and exhibits a high frequency of stable integration (Flotte et al., 1992; Samulski et al., 1989; McLaughlin et al., 1989). One advantageous feature of AAV derives from its reduced efficacy for transducing primary cells relative to transformed cells.

BAC Vectors

The bacterial artificial chromosome (BAC) cloning system (Shizuya et al., 1992) has been developed to stably maintain large fragments of genomic DNA (100-300 kb) in E. coli. A preferred BAC vector comprises a pBeloBAC11 vector that has been described by Kim et al. (1996). BAC libraries are prepared with this vector using size-selected genomic DNA that has been partially digested using enzymes that permit ligation into either the BamHI or HindIII sites in the vector. Flanking these cloning sites are T7 and SP6 RNA polymerase transcription initiation sites that can be used to generate end probes by either RNA transcription or PCR methods. After the construction of a BAC library in E. coli, BAC DNA is purified from the host cell as a supercoiled circle. Converting these circular molecules into a linear form precedes both size determination and introduction of the BACs into recipient cells. The cloning site is flanked by two NotI sites, permitting cloned segments to be excised from the vector by NotI digestion. Alternatively, the DNA insert contained in the pBeloBAC11 vector may be linearized by treatment of the BAC vector with the commercially available enzyme lambda terminase that leads to the cleavage at the unique cosN site, but this cleavage method results in a full length BAC clone containing both the insert DNA and the BAC sequences.
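The excision of a cloned segment by NotI digestion can be illustrated with a short Python sketch that locates NotI recognition sites (GCGGCCGC) in a circular sequence and reports the sizes of the fragments a complete digest would give. This is a simplified sketch under stated assumptions: the toy sequence below is hypothetical and is not the pBeloBAC11 sequence, the function names are illustrative, and fragment sizes are measured between site start positions, which equals the distance between cuts because every site is cut at the same offset.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence (palindromic)


def find_sites_circular(seq, site=NOTI_SITE):
    """Return 0-based start positions of `site` in a circular sequence."""
    seq = seq.upper()
    site = site.upper()
    # Append the first len(site) - 1 bases so that a site spanning the
    # origin of the circle is also found.
    extended = seq + seq[: len(site) - 1]
    positions = []
    start = extended.find(site)
    while start != -1:
        positions.append(start)
        start = extended.find(site, start + 1)
    return positions


def fragment_lengths_circular(seq_len, positions):
    """Return fragment sizes after a complete digest of a circular molecule."""
    if not positions:
        return [seq_len]  # no site: the circle is left uncut
    ordered = sorted(positions)
    lengths = []
    for i, pos in enumerate(ordered):
        nxt = ordered[(i + 1) % len(ordered)]
        # A single site linearizes the circle into one full-length fragment.
        lengths.append((nxt - pos) % seq_len or seq_len)
    return lengths


if __name__ == "__main__":
    # Hypothetical 34-base circular "clone" with an insert flanked by two sites.
    toy = "AAAA" + NOTI_SITE + "T" * 10 + NOTI_SITE + "CCCC"
    sites = find_sites_circular(toy)
    print(sites)                                       # [4, 22]
    print(fragment_lengths_circular(len(toy), sites))  # [18, 16]
```

With two flanking sites the digest yields two fragments, one carrying the insert and one carrying the vector backbone, which is the situation described for NotI excision above.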

5. Delivery of the Recombinant Vectors

In order to effect expression of the polynucleotides and polynucleotide constructs of the invention, these constructs must be delivered into a cell (or cell extract capable of supporting protein expression). This delivery may be accomplished in vitro, as in laboratory procedures for transforming cell lines, or in vivo or ex vivo, as in the treatment of certain disease states.

One mechanism is viral infection where the expression construct isencapsulated in an infectious viral particle.

Several non-viral methods for the transfer of polynucleotides into cultured mammalian cells are also contemplated by the present invention, and include, without being limited to, calcium phosphate precipitation (Graham et al., 1973; Chen et al., 1987), DEAE-dextran (Gopal, 1985), electroporation (Tur-Kaspa et al., 1986; Potter et al., 1984), direct microinjection (Harland et al., 1985), DNA-loaded liposomes (Nicolau et al., 1982; Fraley et al., 1979), and receptor-mediated transfection (Wu and Wu, 1987; 1988). Some of these techniques may be successfully adapted for in vivo or ex vivo use.

Once the expression polynucleotide has been delivered into the cell, it may be stably integrated into the genome of the recipient cell. This integration may be in the cognate location and orientation via homologous recombination (gene replacement) or it may be integrated in a random, non-specific location (gene augmentation). In yet further embodiments, the nucleic acid may be stably maintained in the cell as a separate, episomal segment of DNA. Such nucleic acid segments or “episomes” encode sequences sufficient to permit maintenance and replication independent of or in synchronization with the host cell cycle.

One specific embodiment for a method for delivering a protein or peptide to the interior of a cell of a vertebrate in vivo comprises the step of introducing a preparation comprising a physiologically acceptable carrier and a naked polynucleotide operatively coding for the polypeptide of interest into the interstitial space of a tissue comprising the cell, whereby the naked polynucleotide is taken up into the interior of the cell and has a physiological effect. This is particularly applicable for transfer in vitro but it may be applied to in vivo as well.

Compositions for use in vitro and in vivo comprising a “naked” polynucleotide are described in PCT application No. WO 90/11092 (Vical Inc.) and also in PCT application No. WO 95/11307 (Institut Pasteur, INSERM, Université d'Ottawa) as well as in the articles of Tascon et al. (1996) and of Huygen et al. (1996).

In still another embodiment of the invention, the transfer of a naked polynucleotide of the invention, including a polynucleotide construct of the invention, into cells may be performed by particle bombardment (biolistic), said particles being DNA-coated microprojectiles accelerated to a high velocity allowing them to pierce cell membranes and enter cells without killing them, such as described by Klein et al. (1987).

In a further embodiment, the polynucleotide of the invention may be entrapped in a liposome using any of a wide variety of standard methods (see, e.g., Ghosh and Bachhawat, 1991; Wong et al., 1980; Nicolau et al., 1987).

In a specific embodiment, the invention provides a composition for the in vivo production of the GlyT1 protein or polypeptide described herein. It comprises a naked polynucleotide operatively coding for this polypeptide, in solution in a physiologically acceptable carrier, and suitable for introduction into a tissue to cause cells of the tissue to express the said protein or polypeptide.

The amount of vector to be injected into the desired host organism varies according to the site of injection. As an indicative dose, between 0.1 and 100 μg of the vector will be injected into an animal body, preferably a mammal body, for example a mouse body.

In another embodiment, the vector according to the invention may be introduced in vitro into a host cell, preferably a host cell previously harvested from the animal to be treated and more preferably a somatic cell such as a muscle cell. In a subsequent step, the cell that has been transformed with the vector coding for the desired GlyT1 polypeptide or the desired fragment thereof is reintroduced into the animal body in order to deliver the recombinant protein within the body either locally or systemically.

Cell Hosts

Another object of the invention comprises a host cell that has been transformed or transfected with one of the polynucleotides described herein, and in particular a polynucleotide either comprising a GlyT1 regulatory polynucleotide or the coding sequence of the GlyT1 polypeptide selected from the group consisting of SEQ ID NOs:2-9 and 14-21, or a fragment or a variant thereof. Also included are host cells that are transformed (prokaryotic cells) or that are transfected (eukaryotic cells) with a recombinant vector such as one of those described above. More particularly, the cell hosts of the present invention can comprise any of the polynucleotides described in the “Genomic Sequences Of The GlyT1 Gene” section, the “GlyT1 cDNA Sequences” section, the “Coding Regions” section, the “Polynucleotide constructs” section, and the “Oligonucleotide Probes And Primers” section.

An additional recombinant cell host according to the invention comprises any of the vectors described herein, more particularly any of the vectors described in the “Recombinant Vectors” section.

Preferred host cells used as recipients for the expression vectors of the invention are the following:

a) Prokaryotic host cells: Escherichia coli strains (e.g., the DH5-α strain), Bacillus subtilis, Salmonella typhimurium, and strains from species like Pseudomonas, Streptomyces and Staphylococcus.

b) Eukaryotic host cells: HeLa cells (ATCC No. CCL2; No. CCL2.1; No. CCL2.2), CV-1 cells (ATCC No. CCL70), COS cells (ATCC No. CRL1650; No. CRL1651), Sf-9 cells (ATCC No. CRL1711), C127 cells (ATCC No. CRL-1804), 3T3 (ATCC No. CRL-6361), CHO (ATCC No. CCL-61), human kidney 293 (ATCC No. 45504; No. CRL-1573) and BHK (ECACC No. 84100501; No. 84111301).

c) Other mammalian host cells.

The GlyT1 gene expression in mammalian, and typically human, cells may be inhibited or enhanced by the insertion of a GlyT1 genomic or cDNA sequence, with the replacement of the GlyT1 gene counterpart in the genome of an animal cell by a GlyT1 polynucleotide according to the invention. These genetic alterations may be generated by homologous recombination events using specific DNA constructs that have been previously described.

One kind of cell host that may be used is mammal zygotes, such as murine zygotes. For example, murine zygotes may undergo microinjection with a purified DNA molecule of interest, for example a purified DNA molecule that has previously been adjusted to a concentration range from 1 ng/ml (for BAC inserts) to 3 ng/μl (for P1 bacteriophage inserts) in 10 mM Tris-HCl, pH 7.4, 250 μM EDTA containing 100 mM NaCl, 30 μM spermine, and 70 μM spermidine. When the DNA to be microinjected has a large size, polyamines and high salt concentrations can be used in order to avoid mechanical breakage of this DNA, as described by Schedl et al. (1993b).
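The buffer composition above lends itself to a simple dilution calculation (C1 × V1 = C2 × V2). The following Python sketch, offered only as an illustration, computes the stock volumes needed for a given final volume of the microinjection buffer and the volume of a DNA stock needed to reach a target concentration. The final concentrations are those given in the text; the stock concentrations, the function names and the example DNA stock are hypothetical assumptions.

```python
# Final concentrations from the text, expressed in mM.
FINAL_MM = {
    "Tris-HCl pH 7.4": 10.0,
    "EDTA": 0.25,        # 250 uM
    "NaCl": 100.0,
    "spermine": 0.03,    # 30 uM
    "spermidine": 0.07,  # 70 uM
}

# Hypothetical stock concentrations in mM (assumptions, not from the text).
STOCK_MM = {
    "Tris-HCl pH 7.4": 1000.0,
    "EDTA": 500.0,
    "NaCl": 5000.0,
    "spermine": 10.0,
    "spermidine": 10.0,
}


def buffer_recipe_ul(final_volume_ul, final_mm=FINAL_MM, stock_mm=STOCK_MM):
    """Return the volume (ul) of each stock, plus water, for the buffer."""
    recipe = {}
    for name, c_final in final_mm.items():
        c_stock = stock_mm[name]
        if c_stock <= c_final:
            raise ValueError(f"stock for {name} is not more concentrated than target")
        # C1 * V1 = C2 * V2  ->  V1 = C2 * V2 / C1
        recipe[name] = c_final * final_volume_ul / c_stock
    recipe["water"] = final_volume_ul - sum(recipe.values())
    return recipe


def dna_volume_ul(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return the volume (ul) of DNA stock to dilute to the target concentration."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("DNA stock is more dilute than the target")
    return target_ng_per_ul * final_volume_ul / stock_ng_per_ul


if __name__ == "__main__":
    for component, volume in buffer_recipe_ul(10_000.0).items():  # 10 ml
        print(f"{component}: {volume:.1f} ul")
    # Hypothetical P1 insert stock at 250 ng/ul diluted to 3 ng/ul in 100 ul.
    print(f"DNA stock: {dna_volume_ul(250.0, 3.0, 100.0):.2f} ul")
```

Keeping the final and stock concentrations in two separate tables makes it straightforward to substitute the stocks actually available in a given laboratory.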

Any one of the polynucleotides of the invention, including the DNA constructs described herein, may be introduced in an embryonic stem (ES) cell line, preferably a mouse ES cell line. ES cell lines are derived from pluripotent, uncommitted cells of the inner cell mass of pre-implantation blastocysts. Preferred ES cell lines are the following: ES-E14TG2a (ATCC No. CRL-1821), ES-D3 (ATCC No. CRL1934 and No. CRL-11632), YS001 (ATCC No. CRL-11776), 36.5 (ATCC No. CRL-11116). To maintain ES cells in an uncommitted state, they are cultured in the presence of growth inhibited feeder cells which provide the appropriate signals to preserve this embryonic phenotype and serve as a matrix for ES cell adherence. Preferred feeder cells are primary embryonic fibroblasts that are established from tissue of day 13-day 14 embryos of virtually any mouse strain, that are maintained in culture, such as described by Abbondanzo et al. (1993) and are inhibited in growth by irradiation, such as described by Robertson (1987), or by the presence of an inhibitory concentration of LIF, such as described by Pease and Williams (1990).

The constructs in the host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence.

Following transformation of a suitable host and growth of the host to an appropriate cell density, the selected promoter is induced by appropriate means, such as temperature shift or chemical induction, and cells are cultivated for an additional period.

Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.

Microbial cells employed in the expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Such methods are well known to the skilled artisan.

Transgenic Animals

The terms “transgenic animals” or “host animals” are used herein to designate animals that have their genome genetically and artificially manipulated so as to include one of the nucleic acids according to the invention. Preferred animals are non-human mammals and include those belonging to a genus selected from Mus (e.g. mice), Rattus (e.g. rats) and Oryctolagus (e.g. rabbits) which have their genome artificially and genetically altered by the insertion of a nucleic acid according to the invention. In one embodiment, the invention encompasses non-human host mammals and animals comprising a recombinant vector of the invention.

The transgenic animals of the invention all include within a plurality of their cells a cloned recombinant or synthetic DNA sequence, more specifically one of the purified or isolated nucleic acids comprising a GlyT1 coding sequence, a GlyT1 regulatory polynucleotide, a polynucleotide construct, or a DNA sequence encoding an antisense polynucleotide such as described in the present specification.

Generally, a transgenic animal according to the present invention comprises any one of the polynucleotides, the recombinant vectors and the cell hosts described in the present invention. More particularly, the transgenic animals of the present invention can comprise any of the polynucleotides described in the “Genomic Sequences Of the GlyT1 Gene” section, the “GlyT1 cDNA Sequences” section, the “Coding Regions” section, the “Polynucleotide constructs” section, the “Oligonucleotide Probes And Primers” section, the “Recombinant Vectors” section and the “Cell Hosts” section.

In a first preferred embodiment, these transgenic animals may be good experimental models in order to study the effects of GlyT1 activity, e.g. to study psychological disorders such as schizophrenia or other psychotic disorders. In one such embodiment, transgenic animals are produced in which one or several copies of a polynucleotide encoding any of the present novel GlyT1 proteins has been inserted into the genome.

In a second preferred embodiment, these transgenic animals may express a desired polypeptide of interest under the control of the regulatory polynucleotides of the GlyT1 gene, leading to good yields in the synthesis of this protein of interest, and eventually a tissue specific expression of this protein of interest.

The design of the transgenic animals of the invention may be made according to conventional techniques well known to one skilled in the art. For more details regarding the production of transgenic animals, and specifically transgenic mice, reference may be made to U.S. Pat. Nos. 4,873,191; 5,464,764; or 5,789,215; these documents being herein incorporated by reference to disclose methods of producing transgenic mice.

Transgenic animals of the present invention are produced by the application of procedures which result in an animal with a genome that has incorporated exogenous genetic material. The procedure involves obtaining the genetic material, or a portion thereof, which encodes either a GlyT1 coding sequence, a GlyT1 regulatory polynucleotide or a DNA sequence encoding a GlyT1 antisense polynucleotide such as described in the present specification.

A recombinant polynucleotide of the invention is inserted into anembryonic or ES stem cell line. The insertion is preferably made usingelectroporation, such as described by Thomas et al. (1987). The cellssubjected to electroporation are screened (e.g. by selection viaselectable markers, by PCR or by Southern blot analysis) to findpositive cells which have integrated the exogenous recombinantpolynucleotide into their genome, preferably via an homologousrecombination event. An illustrative positive-negative selectionprocedure that may be used according to the invention is described byMansour et al. (1988).

Then, the positive cells are isolated, cloned and injected into 3.5 daysold blastocysts from mice, such as described by Bradley (1987). Theblastocysts are then inserted into a female host animal and allowed togrow to term.

Alternatively, the positive ES cells are brought into contact with 2.5-day-old embryos at the 8-16 cell stage (morulae), such as described by Wood et al. (1993) or by Nagy et al. (1993), the ES cells being internalized to colonize extensively the blastocyst, including the cells which will give rise to the germ line.

The offspring of the female host are tested to determine which animals are transgenic, i.e., include the inserted exogenous DNA sequence, and which are wild-type.

Thus, the present invention also concerns a transgenic animal containing a nucleic acid, a recombinant expression vector or a recombinant host cell according to the invention.

Recombinant Cell Lines Derived from the Transgenic Animals of the Invention

A further object of the invention comprises recombinant host cells obtained from a transgenic animal described herein. In one embodiment the invention encompasses cells derived from non-human host mammals and animals comprising a recombinant vector of the invention or expressing any of the present novel GlyT1 polypeptides.

Recombinant cell lines may be established in vitro from cells obtained from any tissue of a transgenic animal according to the invention, for example by transfection of primary cell cultures with vectors expressing onc-genes such as SV40 large T antigen, as described by Chou (1989) and Shay et al. (1991).

Methods for Screening Substances Interacting with a GlyT1 Polypeptide

For the purpose of the present invention, a ligand means a molecule, such as a protein, a peptide, an antibody or any synthetic chemical compound, capable of binding to a GlyT1 protein or one of its fragments or variants, or of modulating the expression of the polynucleotide coding for GlyT1 or a fragment or variant thereof.

In the ligand screening method according to the present invention, a biological sample or a defined molecule to be tested as a putative ligand of a GlyT1 protein is brought into contact with the corresponding purified GlyT1 protein, for example the corresponding purified recombinant GlyT1 protein produced by a recombinant cell host as described hereinbefore, in order to form a complex between this protein and the putative ligand molecule to be tested. In any of the herein-described assays, the GlyT1 may be present in a cell or cell membrane during the assay.

As an illustrative example, to study the interaction of any of the present novel GlyT1 proteins, or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of SEQ ID NOs:26-33, or encoded by any of SEQ ID NOs:2-9 or 14-21, with drugs or small molecules, such as molecules generated through combinatorial chemistry approaches, the microdialysis coupled to HPLC method described by Wang et al. (1997) or the affinity capillary electrophoresis method described by Bush et al. (1997), the disclosures of which are incorporated by reference, can be used.

In further methods, peptides, drugs, fatty acids, lipoproteins, or small molecules which interact with the GlyT1 protein, or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of SEQ ID NOs:26-33, or encoded by any of SEQ ID NOs:2-9 or 14-21, may be identified using assays such as the following. The molecule to be tested for binding is labeled with a detectable label, such as a fluorescent, radioactive, or enzymatic tag, and placed in contact with immobilized GlyT1 protein, or a fragment thereof, under conditions which permit specific binding to occur. After removal of non-specifically bound molecules, bound molecules are detected using appropriate means.

Another object of the present invention comprises methods and kits for the screening of candidate substances that interact with a GlyT1 polypeptide.

The present invention pertains to methods for screening substances of interest that interact with a GlyT1 protein or a fragment or variant thereof. By their capacity to bind covalently or non-covalently to a GlyT1 protein or to a fragment or variant thereof, these substances or molecules may be advantageously used both in vitro and in vivo.

In vitro, said interacting molecules may be used as detection means in order to identify the presence of a GlyT1 protein in a sample, preferably a biological sample.

A method for the screening of a candidate substance comprises the following steps:

a) providing a polypeptide comprising, consisting essentially of, or consisting of a GlyT1 protein or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of any of SEQ ID NOs:26-33, or encoded by any of SEQ ID NOs:2-9 or 14-21;

b) obtaining a candidate substance;

c) bringing said polypeptide into contact with said candidate substance;

d) detecting the complexes formed between said polypeptide and said candidate substance.
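The fragment definition used in step a) (a contiguous span of at least 6 amino acids) can be illustrated with a short sketch. The function name and the 8-residue example peptide are hypothetical and are not taken from SEQ ID NOs:26-33; the sketch only enumerates candidate spans and says nothing about which fragments would actually bind a ligand.

```python
from typing import Iterator


def contiguous_fragments(sequence: str, min_len: int = 6) -> Iterator[str]:
    """Yield every contiguous span of `sequence` that is at least `min_len` residues long."""
    n = len(sequence)
    for length in range(min_len, n + 1):
        for start in range(0, n - length + 1):
            yield sequence[start:start + length]


# Hypothetical peptide used purely for illustration (not a disclosed sequence).
example = "MSGGDTRA"
fragments = list(contiguous_fragments(example, min_len=6))
```

For a sequence of length n, the number of spans of a given length k is n - k + 1, so the number of candidate fragments grows quickly with sequence length.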

The invention further concerns a kit for the screening of a candidate substance interacting with the GlyT1 polypeptide, wherein said kit comprises:

a) a GlyT1 protein having an amino acid sequence selected from the group consisting of any of the amino acid sequences of SEQ ID NOs:26-33, or a peptide fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of any of SEQ ID NOs:26-33, or an amino acid sequence encoded by any of SEQ ID NOs:2-9 or 14-21;

b) optionally, means useful to detect the complex formed between the GlyT1 protein or a peptide fragment or a variant thereof and the candidate substance.

In a preferred embodiment of the kit described above, the detection means comprise a monoclonal or polyclonal antibody directed against the GlyT1 protein or a peptide fragment or a variant thereof.

Various candidate substances or molecules can be assayed for interaction with a GlyT1 polypeptide. These substances or molecules include, without being limited to, natural or synthetic organic compounds or molecules of biological origin such as polypeptides. When the candidate substance or molecule comprises a polypeptide, this polypeptide may be the resulting expression product of a phage clone belonging to a phage-based random peptide library, or alternatively the polypeptide may be the resulting expression product of a cDNA library cloned in a vector suitable for performing a two-hybrid screening assay.

The invention also pertains to kits useful for performing the herein-described screening methods. Preferably, such kits comprise a GlyT1 polypeptide or a fragment or a variant thereof, and optionally means useful to detect the complex formed between the GlyT1 polypeptide or its fragment or variant and the candidate substance. In a preferred embodiment the detection means comprise a monoclonal or polyclonal antibody directed against the corresponding GlyT1 polypeptide or a fragment or a variant thereof.

A. Candidate Ligands Obtained from Random Peptide Libraries

In a particular embodiment of the screening method, the putative ligand is the expression product of a DNA insert contained in a phage vector (Parmley and Smith, 1988). Specifically, random peptide phage libraries are used. The random DNA inserts encode peptides of 8 to 20 amino acids in length (Oldenburg K. R. et al., 1992; Valadon P., et al., 1996; Lucas A. H., 1994; Westerink M. A. J., 1995; Felici F. et al., 1991). According to this particular embodiment, the recombinant phage expressing a protein that binds to the immobilized GlyT1 protein is retained and the complex formed between the GlyT1 protein and the recombinant phage may be subsequently immunoprecipitated by a polyclonal or a monoclonal antibody directed against the GlyT1 protein.

Once the ligand library in recombinant phages has been constructed, the phage population is brought into contact with the immobilized GlyT1 protein. Then the preparation of complexes is washed in order to remove the non-specifically bound recombinant phages. The phages that bind specifically to the GlyT1 protein are then eluted by a buffer (acid pH) or immunoprecipitated by the monoclonal antibody produced by the anti-GlyT1 hybridoma, and this phage population is subsequently amplified by over-infection of bacteria (for example E. coli). The selection step may be repeated several times, preferably 2-4 times, in order to select the most specific recombinant phage clones. The last step comprises characterizing the peptide produced by the selected recombinant phage clones either by expression in infected bacteria and isolation, by expressing the phage insert in another host-vector system, or by sequencing the insert contained in the selected recombinant phages.
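The reason for repeating the selection step several times can be illustrated with a simple enrichment calculation. This is a toy model only: the per-round recovery values, the starting fraction of specific phage, and the assumption that amplification in bacteria preserves the proportions of the clones are all hypothetical and are not taken from the present description.

```python
def specific_fraction(f0: float, keep_specific: float,
                      keep_background: float, rounds: int) -> float:
    """Fraction of specifically binding phage after `rounds` of selection.

    f0              -- starting fraction of specific binders in the library (assumed)
    keep_specific   -- fraction of specific phage recovered per round (assumed)
    keep_background -- fraction of non-specific phage surviving the washes per round (assumed)
    """
    if not 0.0 <= f0 <= 1.0:
        raise ValueError("f0 must be between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    specific = f0 * keep_specific ** rounds
    background = (1.0 - f0) * keep_background ** rounds
    return specific / (specific + background)


# Hypothetical library: one specific clone per million, 10% recovery of binders,
# 0.01% carry-over of non-specific phage in each round.
enrichment = [specific_fraction(1e-6, 0.1, 1e-4, n) for n in range(5)]
```

Under these assumed values the specific clones go from a negligible fraction of the population to the majority within a few rounds, which is consistent with the preferred 2-4 repetitions.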

B. Candidate Ligands Obtained by Competition Experiments

Alternatively, peptides, drugs or small molecules which bind to the GlyT1 protein, or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of any of SEQ ID NOs:26-33, or encoded by any of SEQ ID NOs:2-9 or 14-21, may be identified in competition experiments. In such assays, the GlyT1 protein, or a fragment thereof, is immobilized to a surface, such as a plastic plate. Increasing amounts of the peptides, drugs or small molecules are placed in contact with the immobilized GlyT1 protein, or a fragment thereof, in the presence of a detectably labeled known GlyT1 protein ligand. For example, the GlyT1 ligand may be detectably labeled with a fluorescent, radioactive, or enzymatic tag. The ability of the test molecule to bind the GlyT1 protein, or a fragment thereof, is determined by measuring the amount of detectably labeled known ligand bound in the presence of the test molecule. A decrease in the amount of known ligand bound to the GlyT1 protein, or a fragment thereof, when the test molecule is present indicates that the test molecule is able to bind to the GlyT1 protein, or a fragment thereof.
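A competition experiment of this kind is commonly summarized by the concentration of test molecule at which binding of the labeled ligand falls to half of its value without competitor. The following is a minimal sketch of that calculation by interpolation on a logarithmic concentration scale; the function names and all signal values are hypothetical, and a real analysis would more likely fit a full dose-response curve.

```python
import math
from typing import Optional, Sequence


def percent_bound(signal: float, signal_no_competitor: float) -> float:
    """Labeled-ligand signal as a percentage of the signal measured without test molecule."""
    return 100.0 * signal / signal_no_competitor


def ic50_log_interpolation(concentrations: Sequence[float],
                           signals: Sequence[float],
                           signal_no_competitor: float) -> Optional[float]:
    """Estimate the test-molecule concentration giving 50% bound labeled ligand.

    `concentrations` must be positive and in ascending order. Returns None when
    the measured points never cross the 50% level.
    """
    points = [(c, percent_bound(s, signal_no_competitor))
              for c, s in zip(concentrations, signals)]
    for (c_lo, p_lo), (c_hi, p_hi) in zip(points, points[1:]):
        if p_lo >= 50.0 >= p_hi and p_lo != p_hi:
            frac = (p_lo - 50.0) / (p_lo - p_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None


# Hypothetical readings: test-molecule concentration (M) and labeled-ligand signal.
conc = [1e-9, 1e-8, 1e-7, 1e-6]
signal = [950.0, 750.0, 250.0, 50.0]
ic50 = ic50_log_interpolation(conc, signal, signal_no_competitor=1000.0)
```

A test molecule that does not displace the labeled ligand produces no crossing of the 50% level, in which case the function returns None rather than an extrapolated value.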

C. Candidate Ligands Obtained by Affinity Chromatography

Proteins or other molecules interacting with the GlyT1 protein, or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of SEQ ID NOs:26-33, or encoded by any of SEQ ID NOs:2-9 or 14-21, can also be found using affinity columns which contain the GlyT1 protein, or a fragment thereof. The GlyT1 protein, or a fragment thereof, may be attached to the column using conventional techniques including chemical coupling to a suitable column matrix such as agarose, Affi Gel®, or other matrices familiar to those of skill in the art. In some embodiments of this method, the affinity column contains chimeric proteins in which the GlyT1 protein, or a fragment thereof, is fused to glutathione S-transferase (GST). A mixture of cellular proteins or pool of expressed proteins as described above is applied to the affinity column. Proteins or other molecules interacting with the GlyT1 protein, or a fragment thereof, attached to the column can then be isolated and analyzed on 2-D electrophoresis gel as described in Ramunsen et al. (1997), the disclosure of which is incorporated by reference. Alternatively, the proteins retained on the affinity column can be purified by electrophoresis-based methods and sequenced. The same method can be used to isolate antibodies, to screen phage display products, or to screen phage display human antibodies.

D. Candidate Ligands Obtained by Optical Biosensor Methods

Proteins interacting with the GlyT1 protein, or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of any of SEQ ID NOs:26-33, or encoded by any of SEQ ID NOs:2-9 or 14-21, can also be screened by using an Optical Biosensor as described in Edwards and Leatherbarrow (1997) and also in Szabo et al. (1995), the disclosures of which are incorporated herein by reference. This technique permits the detection of interactions between molecules in real time, without the need for labeled molecules. This technique is based on the surface plasmon resonance (SPR) phenomenon. Briefly, the candidate ligand molecule to be tested is attached to a surface (such as a carboxymethyl dextran matrix). A light beam is directed towards the side of the surface that does not contain the sample to be tested and is reflected by said surface. The SPR phenomenon causes a decrease in the intensity of the reflected light with a specific association of angle and wavelength. The binding of candidate ligand molecules causes a change in the refractive index on the surface, which change is detected as a change in the SPR signal. For screening of candidate ligand molecules or substances that are able to interact with the GlyT1 protein, or a fragment thereof, the GlyT1 protein, or a fragment thereof, is immobilized onto a surface. This surface comprises one side of a cell through which flows the candidate molecule to be assayed. The binding of the candidate molecule to the GlyT1 protein, or a fragment thereof, is detected as a change of the SPR signal. The candidate molecules tested may be proteins, peptides, carbohydrates, lipids, or small molecules generated by combinatorial chemistry. This technique may also be performed by immobilizing eukaryotic or prokaryotic cells or lipid vesicles exhibiting an endogenous or a recombinantly expressed GlyT1 protein at their surface.

The main advantage of the method is that it allows the determination of the association rate between the GlyT1 protein and molecules interacting with the GlyT1 protein. It is thus possible to select specifically ligand molecules interacting with the GlyT1 protein, or a fragment thereof, through strong or conversely weak association constants.
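How rate constants relate to the measured SPR signal can be sketched with the simple 1:1 binding model that is often used to interpret such sensorgrams. The model choice, the function names, and every numerical value below are assumptions made for illustration; the present description does not specify a kinetic model.

```python
import math


def dissociation_constant(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant KD (M) from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


def association_response(t: float, conc: float, k_on: float,
                         k_off: float, r_max: float) -> float:
    """SPR response at time t (s) during injection of analyte at concentration conc (M).

    Assumes 1:1 binding: R(t) = R_eq * (1 - exp(-k_obs * t)),
    with k_obs = k_on * conc + k_off and R_eq = r_max * conc / (conc + KD).
    """
    k_obs = k_on * conc + k_off
    r_eq = r_max * conc / (conc + dissociation_constant(k_on, k_off))
    return r_eq * (1.0 - math.exp(-k_obs * t))


# Hypothetical rate constants for a candidate ligand.
k_on, k_off = 1.0e5, 1.0e-3
kd = dissociation_constant(k_on, k_off)                              # 1e-8 M
plateau = association_response(1.0e6, kd, k_on, k_off, r_max=100.0)  # half of r_max at conc = KD
```

Comparing KD values obtained in this way for several candidates is one possible route to ranking them by strength of association.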

E. Candidate Ligands Obtained Through a Two-Hybrid Screening Assay

The yeast two-hybrid system is designed to study protein-protein interactions in vivo (Fields and Song, 1989), and relies upon the fusion of a bait protein to the DNA binding domain of the yeast Gal4 protein. This technique is also described in U.S. Pat. No. 5,667,973 and U.S. Pat. No. 5,283,173 (Fields et al.), the technical teachings of both patents being herein incorporated by reference.

The general procedure of library screening by the two-hybrid assay may be performed as described by Harper et al. (1993) or as described by Cho et al. (1998) or also Fromont-Racine et al. (1997).

The bait protein or polypeptide comprises, consists essentially of, or consists of a GlyT1 polypeptide or a fragment comprising a contiguous span of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at least 12, 15, 20, 25, 30, 40, 50, or 100 amino acids of any of SEQ ID NOs:26-33, or encoded by any of SEQ ID NOs:2-9 or 14-21.

More precisely, the nucleotide sequence encoding the GlyT1 polypeptide or a fragment or variant thereof is fused to a polynucleotide encoding the DNA binding domain of the GAL4 protein, the fused nucleotide sequence being inserted into a suitable expression vector, for example pAS2 or pM3.

Then, a human cDNA library is constructed in a specially designed vector, such that the human cDNA insert is fused to a nucleotide sequence in the vector that encodes the transcriptional activation domain of the GAL4 protein. Preferably, the vector used is the pACT vector. The polypeptides encoded by the nucleotide inserts of the human cDNA library are termed “prey” polypeptides.

A third vector contains a detectable marker gene, such as a beta-galactosidase gene or CAT gene, that is placed under the control of a regulatory sequence that is responsive to the binding of a complete Gal4 protein containing both the transcriptional activation domain and the DNA binding domain. For example, the vector pG5EC may be used.

Two different yeast strains are also used. As an illustrative but non-limiting example, the two different yeast strains may be the following:

-   Y190, the phenotype of which is (MATa, Leu2-3, 112 ura3-12, trp1-901, his3-D200, ade2-101, gal4Dgal180D URA3 GAL-LacZ, LYS GAL-HIS3, cyh);
-   Y187, the phenotype of which is (MATα gal4 gal80 his3 trp1-901 ade2-101 ura3-52 leu2-3, 112 URA3 GAL-lacZ met⁻), which is the opposite mating type of Y190.

Briefly, 20 μg of pAS2/GLYT1 and 20 μg of pACT-cDNA library are co-transformed into yeast strain Y190. The transformants are selected for growth on minimal media lacking histidine, leucine and tryptophan, but containing the histidine synthesis inhibitor 3-AT (50 mM). Positive colonies are screened for beta galactosidase by filter lift assay. The double positive colonies (His⁺, beta-gal⁺) are then grown on plates lacking histidine and leucine, but containing tryptophan and cycloheximide (10 mg/ml), to select for loss of pAS2/GLYT1 plasmids but retention of pACT-cDNA library plasmids. The resulting Y190 strains are mated with Y187 strains expressing GLYT1 or non-related control proteins, such as cyclophilin B, lamin, or SNF1, as Gal4 fusions as described by Harper et al. (1993) and by Bram et al. (1993), and screened for beta galactosidase by filter lift assay. Yeast clones that are beta-gal⁺ after mating with the control Gal4 fusions are considered false positives.

In another embodiment of the two-hybrid method according to the invention, interaction between the GlyT1 or a fragment or variant thereof with cellular proteins may be assessed using the Matchmaker Two Hybrid System 2 (Catalog No. K1604-1, Clontech). As described in the manual accompanying the Matchmaker Two Hybrid System 2 (Catalog No. K1604-1, Clontech), the disclosure of which is incorporated herein by reference, nucleic acids encoding the GlyT1 protein or a portion thereof are inserted into an expression vector such that they are in frame with DNA encoding the DNA binding domain of the yeast transcriptional activator GAL4. A desired cDNA, preferably human cDNA, is inserted into a second expression vector such that it is in frame with DNA encoding the activation domain of GAL4. The two expression plasmids are transformed into yeast and the yeast are plated on selection medium which selects for expression of selectable markers on each of the expression vectors as well as GAL4 dependent expression of the HIS3 gene. Transformants capable of growing on medium lacking histidine are screened for GAL4 dependent lacZ expression. Those cells which are positive in both the histidine selection and the lacZ assay contain an interaction between GlyT1 and the protein or peptide encoded by the initially selected cDNA insert.
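The final selection rule (positive in both the histidine selection and the lacZ assay) amounts to a logical AND over two reporter readouts. A minimal sketch of that bookkeeping follows; the record layout and the clone identifiers are hypothetical and serve only to show how screening results might be tabulated.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Transformant:
    clone_id: str
    grows_without_histidine: bool  # GAL4-dependent HIS3 expression
    lacz_positive: bool            # GAL4-dependent lacZ expression


def candidate_interactors(transformants: Iterable[Transformant]) -> List[str]:
    """Return the clones that are positive for both reporters."""
    return [t.clone_id for t in transformants
            if t.grows_without_histidine and t.lacz_positive]


# Hypothetical screening results.
results = [
    Transformant("clone-01", True, True),
    Transformant("clone-02", True, False),
    Transformant("clone-03", False, False),
]
hits = candidate_interactors(results)
```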

Methods for Identifying Modulators of GlyT1 Activity

Any of a large number of assays, agonists, and antagonists are known and can be used to assess the activity of any of the herein-described GlyT1 polypeptides. Assays include in vitro, ex vivo, and in vivo assays. For example, assays can be used in which the activity of the transporter is measured in cells (e.g. COS-7 cells, Xenopus oocytes, human embryonic kidney 293 cells) transfected with nucleic acids encoding the transporter, or using tissue homogenates or biological samples that contain cells naturally expressing the transporter (e.g. chondrocytes, placental choriocarcinoma cells, hippocampal pyramidal neurons), and using any of a large number of methods to assess transporter activity, including by detecting signal transduction molecule activity or levels, levels of transcription of genes responsive to transporter activity, etc. Compounds identified as modulators of any of the GlyT1 polypeptides, and compounds found to physically interact with a GlyT1 polypeptide, have a large number of uses, including for the treatment or prevention of a number of neurological and psychological disorders, e.g., disorders related to NMDA receptor signalling, such as schizophrenia.

The effect of a compound on GlyT1 activity can be assessed in any of a large number of ways, including, but not limited to, by examining glycine transport or uptake (e.g. using whole-cell patch-clamp recordings of hippocampal pyramidal neurons in vitro), synaptic transmission through vertebrate autonomic ganglia, postsynaptic nicotinic acetylcholine receptor (nAChR) activity, N-methyl-D-aspartate receptor (NMDAR) function, any animal model for assessing NMDA receptor activity, e.g. using behavioral assays, or any other assay for assessing glycine transport in cells or in animals.

Examples of suitable ligands for use in the present assays, including agonists and antagonists, inhibitors or activators, include, but are not limited to, sarcosine (GlyT1 inhibitor), alpha-methylaminoisobutyric acid (MeAIB) (inhibitor of glycine transport), glycine methyl ester, glycine ethyl ester, 2-amino-5-phosphonovaleric acid (inhibitor of glycine transport), 7-chloro-kynurenic acid (inhibitor of glycine transport), doxepin, amitriptyline, N[3-(4′-fluorophenyl)-3-(4′-phenylphenoxy)propyl]sarcosine (NFPS; inhibitor), nortriptyline, as well as any compound structurally related to any of these compounds, or any other compound that interacts with or modulates any of the presently described glycine transporters. Such compounds can either be used as positive or negative controls in the herein-described assays, or can be included in the assay so that the test compound is assessed for its ability to modulate the known effect of a ligand on the transporter. These compounds having known activity on GlyT1 transporters can also preferably be used as “lead compounds” to identify related compounds with potentially enhanced properties, e.g. in terms of activity or absence of side effects.

As described above, the ability of a compound to alter the binding of a known ligand (e.g. glycine) to any of the herein-described glycine transporters, in vitro, in vivo, or ex vivo, can also be used.

Methods of assaying glycine transporter activity, and glycine transporter interacting ligands, are described in, inter alia, Horiuchi et al. (2001) PNAS 98(4):1448-53; Tsen et al. (2000) Nat Neurosci 3(2):126-32; Evans et al. (1999) FEBS Lett 463(3):301-6; Barker et al. (1999) J Physiol 514 (Pt 3):795-808; Liu et al. (1994) Biochim Biophys Acta 1194(1):176-84; Kim et al. (1994) Mol Pharmacol 45(4):608-17; Liu et al. (1992) FEBS Lett 305(2):110-4; Bergeron et al. (1998) Proc Natl Acad Sci USA 95(26):15730-4; Nunez et al. (2000) Br J Pharmacol 129(1):200-6; the entire disclosure of each of which is herein incorporated by reference.

Methods for Inhibiting the Expression of a GlyT1 cDNA

Other therapeutic compositions according to the present invention advantageously comprise an oligonucleotide fragment of the nucleic acid sequence of GlyT1 as an antisense tool to inhibit the expression of the corresponding GlyT1 cDNA.

Preferred methods using antisense polynucleotides according to the present invention are the procedures described by Sczakiel et al. (1995).

Preferably, the antisense tools are chosen among the polynucleotides (15-200 bp long) that are complementary to the 5′ end of the GlyT1 mRNA of interest. In another embodiment, a combination of different antisense polynucleotides complementary to different parts of the desired targeted gene is used.

Preferred antisense polynucleotides according to the present invention are complementary to a sequence of the mRNAs of GlyT1 that contains either the translation initiation codon ATG or a splicing donor or acceptor site.

The antisense nucleic acids should have a length and melting temperature sufficient to permit formation of an intracellular duplex having sufficient stability to inhibit the expression of the GlyT1 mRNA in the duplex. Strategies for designing antisense nucleic acids suitable for use in gene therapy are disclosed in Green et al. (1986) and Izant and Weintraub (1984), the disclosures of which are incorporated herein by reference.
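The relationship between a target region, its antisense sequence, and a rough melting temperature can be sketched as follows. The 15-base target containing an ATG is hypothetical and is not taken from any GlyT1 sequence, and the melting temperature uses a widely quoted empirical approximation for DNA oligonucleotides, Tm = 64.9 + 41 × (G + C − 16.4) / N, which is only a coarse guide and does not model intracellular duplex stability.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense: str) -> str:
    """Return the reverse complement (written 5' to 3') of a sense-strand sequence."""
    sense = sense.upper().replace("U", "T")
    if set(sense) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, T or U")
    return sense.translate(_COMPLEMENT)[::-1]


def gc_fraction(oligo: str) -> float:
    """Fraction of G and C bases in the oligonucleotide."""
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def basic_tm_celsius(oligo: str) -> float:
    """Rough melting temperature estimate (empirical approximation, see lead-in)."""
    gc = oligo.count("G") + oligo.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(oligo)


# Hypothetical 15-base target region spanning a translation initiation codon (ATG).
target = "GCCACCATGGCGGAC"
oligo = antisense_oligo(target)   # contains CAT, the complement of ATG
tm = basic_tm_celsius(oligo)
```

Longer or more GC-rich antisense sequences give higher estimates under this approximation, which is in line with the preference for a length and melting temperature sufficient for a stable duplex.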

In some strategies, antisense molecules are obtained by reversing the orientation of the GlyT1 coding region with respect to a promoter so as to transcribe the opposite strand from that which is normally transcribed in the cell. The antisense molecules may be transcribed using in vitro transcription systems such as those which employ T7 or SP6 polymerase to generate the transcript. Another approach involves transcription of GlyT1 antisense nucleic acids in vivo by operably linking DNA containing the antisense sequence to a promoter in a suitable expression vector.

Alternatively, suitable antisense strategies are those described by Rossi et al. (1991), in International Application Nos. WO 94/23026, WO 95/04141, WO 92/18522 and in European Patent Application No. EP 0 572 287 A2.

An alternative to the antisense technology that is used according to the present invention comprises using ribozymes that will bind to a target sequence via their complementary polynucleotide tail and that will cleave the corresponding RNA by hydrolyzing its target site (e.g., “hammerhead ribozymes”). Briefly, the simplified cycle of a hammerhead ribozyme comprises (1) sequence-specific binding to the target RNA via complementary antisense sequences; (2) site-specific hydrolysis of the cleavable motif of the target strand; and (3) release of cleavage products, which gives rise to another catalytic cycle. Indeed, the use of long-chain antisense polynucleotides (at least 30 bases long) or ribozymes with long antisense arms is advantageous. A preferred delivery system for antisense ribozymes is achieved by covalently linking these antisense ribozymes to lipophilic groups or by using liposomes as a convenient vector. Preferred antisense ribozymes according to the present invention are prepared as described by Sczakiel et al. (1995), the specific preparation procedures referred to in said article being herein incorporated by reference.

Treatment of Neurological and Psychiatric Disorders

The present GlyT1 polypeptides, polynucleotides, and modulators thereof, can be used to treat or prevent any of a large number of diseases or conditions. For example, any disease, disorder, or condition associated with an elevated or reduced level of glycine or glycine transporter activity can be treated or prevented by modulating the activity or expression of any of the herein-described polypeptides.

In one preferred embodiment, any of the present polypeptides, polynucleotides, or modulators is used to treat or prevent a condition associated with abnormal NMDA receptor activity.

NMDA receptors have been implicated in a large number of neurological and psychological functions, including memory and learning. For example, as decreased function of NMDA-mediated neurotransmission has been suggested to contribute to the symptoms of schizophrenia (Olney and Farber, Archives General Psychiatry 52: 998-1007 (1996)), agents that inhibit GlyT1 transporters (and thus increase glycine activation of NMDA receptors) can be used to treat schizophrenia or other psychotic conditions. Such inhibitors can also be used to treat dementia-associated disorders, as well as other conditions such as attention deficit disorders and organic brain syndromes. In addition, activators of the transporters (which cause decreased glycine-activation of NMDA receptors) can be used to treat neuronal death associated with stroke or head trauma, as well as neurodegenerative diseases such as Alzheimer's disease, multi-infarct dementia, AIDS dementia, Parkinson's disease, Huntington's disease, or amyotrophic lateral sclerosis.

Pharmaceutical and Physiologically Acceptable Compositions and Administration Thereof

To treat or prevent any of the herein-described disorders using any of the compounds described herein, the compounds may be prepared utilizing readily available starting materials and employing common synthetic methodologies well-known to those skilled in the art.

The effective dose of the compound can vary, depending upon factors such as the condition of the patient, the severity of the symptoms of the disorder, and the manner in which the pharmaceutical composition is administered. For human patients, the effective dose of typical compounds generally requires administering the compound in an amount of at least about 1, often at least about 10, and frequently at least about 25 mg/24 hr./patient. For human patients, the effective dose of typical compounds requires administering the compound in an amount which generally does not exceed about 500, often does not exceed about 400, and frequently does not exceed about 300 mg/24 hr./patient. In addition, administration of the effective dose is such that the concentration of the compound within the plasma of the patient normally does not exceed 500 ng/ml, and frequently does not exceed 100 ng/ml.

The compounds of the present invention can be administered to a patient at dosage levels in the range of about 0.1 to about 1,000 mg per day. For a normal human adult having a body weight of about 70 kilograms, a dosage in the range of about 0.01 to about 100 mg per kilogram of body weight per day is sufficient. The specific dosage used, however, can vary. For example, the dosage can depend on a number of factors including the requirements of the patient, the severity of the condition being treated, and the pharmacological activity of the compound being used. The determination of optimum dosages for a particular patient is well-known to those skilled in the art. One preferred dosage is about 10 mg to about 70 mg per day. In choosing a regimen for patients suffering from psychotic illness it may frequently be necessary to begin with a dosage of from about 30 to about 70 mg per day and, when the condition is under control, to reduce the dosage to as low as from about 1 to about 10 mg per day. The exact dosage will depend upon the mode of administration, the form in which the compound is administered, the subject to be treated and the body weight of the subject to be treated, and the preference and experience of the physician or veterinarian in charge.
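Because dosages are expressed both per day and per kilogram of body weight per day, the conversion between the two for a given body weight is a simple multiplication or division. The following sketch shows that arithmetic only; the function names are hypothetical, and the example values are used purely to illustrate the unit conversion, not to recommend any dose.

```python
def daily_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Convert a dose in mg per kg of body weight per day into mg per day."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


def dose_mg_per_kg(daily_dose_mg_value: float, body_weight_kg: float) -> float:
    """Convert a dose in mg per day into mg per kg of body weight per day."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return daily_dose_mg_value / body_weight_kg


# For an adult of about 70 kg, 1 mg/kg/day corresponds to 70 mg/day,
# and 10 mg/day corresponds to roughly 0.14 mg/kg/day.
upper = daily_dose_mg(1.0, 70.0)
lower = dose_mg_per_kg(10.0, 70.0)
```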

Dosage levels of the order of from about 0.1 mg to about 140 mg per kilogram of body weight per day are useful in the treatment of the above-indicated conditions (about 0.5 mg to about 7 g per patient per day). The amount of active ingredient that may be combined with the carrier materials to produce a single dosage form will vary depending upon the host treated and the particular mode of administration. Dosage unit forms will generally contain from about 1 mg to about 500 mg of an active ingredient.

It will be understood, however, that the specific dose level for any particular patient will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, sex, diet, time of administration, route of administration, and rate of excretion, drug combination and the severity of the particular disease undergoing therapy.

Preferred compounds useful according to the method of the present invention have the ability to pass across the blood-brain barrier of the patient. As such, such compounds have the ability to enter the central nervous system of the patient. The log P values of typical compounds useful in carrying out the present invention generally are greater than 0, often are greater than about 1, and frequently are greater than about 1.5. The log P values of such typical compounds generally are less than about 4, often are less than about 3.5, and frequently are less than about 3. Log P values provide a measure of the ability of a compound to pass across a diffusion barrier, such as a biological membrane. See, Hansch, et al., J. Med. Chem., Vol. 11, p. 1 (1968). Alternatively, the compositions of the present invention can bypass the blood brain barrier through the use of compositions and methods known in the art for bypassing the blood brain barrier (e.g., U.S. Pat. Nos. 5,686,416; 5,994,392, incorporated by reference in their entireties) or can be injected directly into the brain. Suitable areas include the cerebral cortex, cerebellum, midbrain, brainstem, hypothalamus, spinal cord and ventricular tissue, and areas of the PNS including the carotid body and the adrenal medulla. The compositions can be administered as a bolus or through the use of other methods such as an osmotic pump.
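The nested log P ranges given above (greater than 0 and less than about 4; greater than about 1 and less than about 3.5; greater than about 1.5 and less than about 3) can be expressed as a simple classification. The function name, the labels, and the treatment of the "about" limits as exact cut-offs are assumptions made for illustration; log P values themselves would have to be measured or estimated separately.

```python
def logp_band(log_p: float) -> str:
    """Classify a log P value against the nested ranges stated in the text.

    Returns the narrowest range that contains the value, or "outside".
    """
    if 1.5 < log_p < 3.0:
        return "frequently"
    if 1.0 < log_p < 3.5:
        return "often"
    if 0.0 < log_p < 4.0:
        return "generally"
    return "outside"


# Hypothetical log P values for four candidate compounds.
bands = {name: logp_band(value)
         for name, value in {"A": 2.0, "B": 3.2, "C": 0.5, "D": 4.5}.items()}
```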

The compounds of the present invention can be administered to a patient alone or as part of a composition that contains other components such as excipients, diluents, and carriers, all of which are well-known in the art. The compositions can be administered to humans and animals either orally, rectally, parenterally (intravenously, intramuscularly or subcutaneously), intracisternally, intravaginally, intraperitoneally, intravesically, locally (powders, ointments or drops), or as a buccal or nasal spray.

Compositions suitable for parenteral injection can comprise physiologically acceptable sterile aqueous or nonaqueous solutions, dispersions, suspensions or emulsions, and sterile powders for reconstitution into sterile injectable solutions or dispersions. Examples of suitable aqueous and nonaqueous carriers, diluents, solvents or vehicles include water, ethanol, polyols (propylene glycol, polyethylene glycol, glycerol, and the like), suitable mixtures thereof, vegetable oils (such as olive oil) and injectable organic esters such as ethyl oleate. Proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersions and by the use of surfactants.

These compositions can also contain adjuvants such as preserving, wetting, emulsifying, and dispensing agents. Prevention of the action of microorganisms can be ensured by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, and the like. It may also be desirable to include isotonic agents, for example sugars, sodium chloride, and the like. Prolonged absorption of the injectable pharmaceutical form can be brought about by the use of agents delaying absorption, for example, aluminum monostearate and gelatin.

Solid dosage forms for oral administration include capsules, tablets, pills, powders, and granules. In such solid dosage forms, the active compound is admixed with at least one customary inert excipient (or carrier) such as sodium citrate or dicalcium phosphate or (a) fillers or extenders, as for example, starches, lactose, sucrose, glucose, mannitol, and silicic acid; (b) binders, as for example, carboxymethylcellulose, alginates, gelatin, polyvinylpyrrolidone, sucrose and acacia; (c) humectants, as for example, glycerol; (d) disintegrating agents, as for example, agar-agar, calcium carbonate, potato or tapioca starch, alginic acid, certain complex silicates and sodium carbonate; (e) solution retarders, as for example paraffin; (f) absorption accelerators, as for example, quaternary ammonium compounds; (g) wetting agents, as for example, cetyl alcohol and glycerol monostearate; (h) adsorbents, as for example, kaolin and bentonite; and (i) lubricants, as for example, talc, calcium stearate, magnesium stearate, solid polyethylene glycols, sodium lauryl sulfate, or mixtures thereof. In the case of capsules, tablets, and pills, the dosage forms may also comprise buffering agents.

Solid compositions of a similar type may also be employed as fillers in soft and hard-filled gelatin capsules using such excipients as lactose or milk sugar as well as high molecular weight polyethylene glycols, and the like. Solid dosage forms such as tablets, dragees, capsules, pills, and granules can be prepared with coatings and shells, such as enteric coatings and others well-known in the art. They may contain opacifying agents and can also be of such composition that they release the active compound or compounds in a certain part of the intestinal tract in a delayed manner. Examples of embedding compositions which can be used are polymeric substances and waxes. The active compounds can also be in micro-encapsulated form, if appropriate, with one or more of the above-mentioned excipients.

Liquid dosage forms for oral administration include pharmaceutically acceptable emulsions, solutions, suspensions, syrups, and elixirs. In addition to the active compounds, the liquid dosage forms may contain inert diluents commonly used in the art, such as water or other solvents, solubilizing agents and emulsifiers, as for example, ethyl alcohol, isopropyl alcohol, ethyl carbonate, ethyl acetate, benzyl alcohol, benzyl benzoate, propylene glycol, 1,3-butylene glycol, dimethylformamide, oils, in particular, cottonseed oil, groundnut oil, corn germ oil, olive oil, castor oil and sesame oil, glycerol, tetrahydrofurfuryl alcohol, polyethylene glycols and fatty acid esters of sorbitan or mixtures of these substances, and the like. Besides such inert diluents, the composition can also include adjuvants, such as wetting agents, emulsifying and suspending agents, sweetening, flavoring, and perfuming agents.

Suspensions, in addition to the active compounds, may contain suspending agents, as for example, ethoxylated isostearyl alcohols, polyoxyethylene sorbitol and sorbitan esters, microcrystalline cellulose, aluminum metahydroxide, bentonite, agar-agar and tragacanth, or mixtures of these substances, and the like.

Compositions for rectal administration are preferably suppositories which can be prepared by mixing the compounds of the present invention with suitable nonirritating excipients or carriers such as cocoa butter, polyethylene glycol or a suppository wax, which are solid at ordinary temperatures but liquid at body temperature and therefore melt in the rectum or vaginal cavity and release the active component.

Dosage forms for topical administration of a compound of this invention include ointments, powders, sprays, and inhalants. The active component is admixed under sterile conditions with a physiologically acceptable carrier and any preservative, buffers, or propellants as may be required. Ophthalmic formulations, eye ointments, powders, and solutions are also contemplated as being within the scope of this invention.

In addition, the compounds of the present invention can exist in unsolvated as well as solvated forms with pharmaceutically acceptable solvents such as water, ethanol, and the like. In general, the solvated forms are considered equivalent to the unsolvated forms for the purposes of the present invention.

Aqueous suspensions contain the active materials in admixture with excipients suitable for the manufacture of aqueous suspensions. Such excipients are suspending agents, for example sodium carboxymethylcellulose, methylcellulose, hydroxypropylmethylcellulose, sodium alginate, polyvinylpyrrolidone, gum tragacanth and gum acacia; dispersing or wetting agents may be a naturally-occurring phosphatide, for example, lecithin, or condensation products of an alkylene oxide with fatty acids, for example polyoxyethylene stearate, or condensation products of ethylene oxide with long chain aliphatic alcohols, for example heptadecaethyleneoxycetanol, or condensation products of ethylene oxide with partial esters derived from fatty acids and a hexitol such as polyoxyethylene sorbitol monooleate, or condensation products of ethylene oxide with partial esters derived from fatty acids and hexitol anhydrides, for example polyethylene sorbitan monooleate. The aqueous suspensions may also contain one or more preservatives, for example ethyl, or n-propyl p-hydroxybenzoate, one or more coloring agents, one or more flavoring agents, and one or more sweetening agents, such as sucrose or saccharin. Oily suspensions may be formulated by suspending the active ingredients in a vegetable oil, for example arachis oil, olive oil, sesame oil or coconut oil, or in a mineral oil such as liquid paraffin. The oily suspensions may contain a thickening agent, for example beeswax, hard paraffin or cetyl alcohol. Sweetening agents such as those set forth above, and flavoring agents may be added to provide palatable oral preparations. These compositions may be preserved by the addition of an anti-oxidant such as ascorbic acid.

Dispersible powders and granules suitable for preparation of an aqueous suspension by the addition of water provide the active ingredient in admixture with a dispersing or wetting agent, suspending agent and one or more preservatives. Suitable dispersing or wetting agents and suspending agents are exemplified by those already mentioned above. Additional excipients, for example sweetening, flavoring and coloring agents, may also be present.

Pharmaceutical compositions of the invention may also be in the form of oil-in-water emulsions. The oily phase may be a vegetable oil, for example olive oil or arachis oil, or a mineral oil, for example liquid paraffin or mixtures of these. Suitable emulsifying agents may be naturally-occurring gums, for example gum acacia or gum tragacanth, naturally-occurring phosphatides, for example soy bean lecithin, and esters or partial esters derived from fatty acids and hexitol anhydrides, for example sorbitan monooleate, and condensation products of the said partial esters with ethylene oxide, for example polyoxyethylene sorbitan monooleate. The emulsions may also contain sweetening and flavoring agents.

Syrups and elixirs may be formulated with sweetening agents, for example glycerol, propylene glycol, sorbitol or sucrose. Such formulations may also contain a demulcent, a preservative and flavoring and coloring agents. The pharmaceutical compositions may be in the form of a sterile injectable aqueous or oleaginous suspension. This suspension may be formulated according to the known art using those suitable dispersing or wetting agents and suspending agents which have been mentioned above. The sterile injectable preparation may also be a sterile injectable solution or suspension in a non-toxic parenterally acceptable diluent or solvent, for example as a solution in 1,3-butanediol. Among the acceptable vehicles and solvents that may be employed are water, Ringer's solution and isotonic sodium chloride solution. In addition, sterile, fixed oils are conventionally employed as a solvent or suspending medium. For this purpose any bland fixed oil may be employed including synthetic mono- or diglycerides. In addition, fatty acids such as oleic acid find use in the preparation of injectables.

The compounds of general formula I may also be administered in the form of suppositories for rectal administration of the drug. These compositions can be prepared by mixing the drug with a suitable non-irritating excipient which is solid at ordinary temperatures but liquid at the rectal temperature and will therefore melt in the rectum to release the drug. Such materials are cocoa butter and polyethylene glycols.

Compounds of general formula I may be administered parenterally in a sterile medium. The drug, depending on the vehicle and concentration used, can either be suspended or dissolved in the vehicle. Advantageously, adjuvants such as local anesthetics, preservatives and buffering agents can be dissolved in the vehicle.

COMPUTER-RELATED EMBODIMENTS

As used herein the term “nucleic acid codes of the invention” encompasses the nucleotide sequences comprising, consisting essentially of, or consisting of any one of the following: a) a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of SEQ ID NO: 1, wherein said contiguous span comprises at least 1, 2, 3, 5, or 10 of the following nucleotide positions of any of SEQ ID NOs:2-9; b) a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of any of SEQ ID NOs:2-9, or the full-length sequence thereof; c) a contiguous span of at least 12, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 500, or 1000 nucleotides of any of SEQ ID NOs: 14-21, or the full-length sequence thereof; and d) a nucleotide sequence complementary to any one of the preceding nucleotide sequences. The “nucleic acid codes of the invention” further encompass nucleotide sequences homologous to any of the above-described sequences. Homologous sequences refer to a sequence having at least 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, or 75% homology to these contiguous spans. Homology may be determined using any method described herein, including BLAST2N with the default parameters or with any modified parameters. Homologous sequences also may include RNA sequences in which uridines replace the thymines in the nucleic acid codes of the invention. It will be appreciated that the nucleic acid codes of the invention can be represented in the traditional single character format (See the inside back cover of Stryer, Lubert. Biochemistry, 3rd edition. W. H. Freeman & Co., New York) or in any other format or code which records the identity of the nucleotides in a sequence.
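As a minimal illustration of how a nucleic acid code held in the single character format might be manipulated as a plain string, the following Python sketch derives the complementary sequence and the RNA form in which uridines replace thymines. The function names and the example string are hypothetical; nothing here is taken from the Sequence Listing.

```python
# Hypothetical helpers for illustration only; names are assumptions.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA string."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complement read in the opposite direction."""
    return complement(seq)[::-1]


def to_rna(seq: str) -> str:
    """Replace thymines with uridines to obtain the RNA form."""
    return seq.replace("T", "U").replace("t", "u")


if __name__ == "__main__":
    example = "ATGC"  # made-up sequence, not from the Sequence Listing
    print(complement(example), reverse_complement(example), to_rna(example))
```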

As used herein the term “polypeptide codes of the invention” encompasses the polypeptide sequences comprising a contiguous span of at least 6, 8, 10, 12, 15, 20, 25, 30, 40, 50, 100 or more amino acids of any of SEQ ID NOs:26-33, or a sequence encoded by any of SEQ ID NOs:2-9 or 14-21. It will be appreciated that the polypeptide codes of the invention can be represented in the traditional single character format or three letter format (See the inside back cover of Stryer, Lubert. Biochemistry, 3rd edition. W. H. Freeman & Co., New York) or in any other format or code which records the identity of the polypeptides in a sequence.

It will be appreciated by those skilled in the art that the nucleic acid codes of the invention and polypeptide codes of the invention can be stored, recorded, and manipulated on any medium which can be read and accessed by a computer. As used herein, the words “recorded” and “stored” refer to a process for storing information on a computer medium. A skilled artisan can readily adopt any of the presently known methods for recording information on a computer readable medium to generate manufactures comprising one or more of the nucleic acid codes of the invention, or one or more of the polypeptide codes of the invention. Another aspect of the present invention is a computer readable medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30, or 50 nucleic acid codes of the invention. Another aspect of the present invention is a computer readable medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30, or 50 polypeptide codes of the invention.

Computer readable media include magnetically readable media, optically readable media, electronically readable media and magnetic/optical media. For example, the computer readable media may be a hard disk, a floppy disk, a magnetic tape, CD-ROM, Digital Versatile Disk (DVD), Random Access Memory (RAM), or Read Only Memory (ROM) as well as other types of media known to those skilled in the art.

Embodiments of the present invention include systems, particularly computer systems which store and manipulate the sequence information described herein. As used herein, “a computer system” refers to the hardware components, software components, and data storage components used to analyze the nucleotide sequences of the nucleic acid codes of the invention or the amino acid sequences of the polypeptide codes of the invention. In one embodiment, the computer system is a Sun Enterprise 1000 server (Sun Microsystems, Palo Alto, Calif.). The computer system preferably includes a processor for processing, accessing and manipulating the sequence data. The processor can be any well-known type of central processing unit, such as the Pentium III from Intel Corporation, or similar processor from Sun, Motorola, Compaq or International Business Machines.

Preferably, the computer system is a general purpose system that comprises the processor and one or more internal data storage components for storing data, and one or more data retrieving devices for retrieving the data stored on the data storage components. A skilled artisan can readily appreciate that any one of the currently available computer systems is suitable.

In one particular embodiment, the computer system includes a processor connected to a bus which is connected to a main memory (preferably implemented as RAM) and one or more internal data storage devices, such as a hard drive and/or other computer readable media having data recorded thereon. In some embodiments, the computer system further includes one or more data retrieving devices for reading the data stored on the internal data storage devices.

The data retrieving device may represent, for example, a floppy disk drive, a compact disk drive, a magnetic tape drive, etc. In some embodiments, the internal data storage device is a removable computer readable medium such as a floppy disk, a compact disk, a magnetic tape, etc. containing control logic and/or data recorded thereon. The computer system may advantageously include or be programmed by appropriate software for reading the control logic and/or the data from the data storage component once inserted in the data retrieving device.

The computer system includes a display which is used to display output to a computer user. It should also be noted that the computer system can be linked to other computer systems in a network or wide area network to provide centralized access to the computer system.

Software for accessing and processing the nucleotide sequences of the nucleic acid codes of the invention or the amino acid sequences of the polypeptide codes of the invention (such as search tools, compare tools, and modeling tools etc.) may reside in main memory during execution.

In some embodiments, the computer system may further comprise a sequence comparer for comparing the above-described nucleic acid codes of the invention or the polypeptide codes of the invention stored on a computer readable medium to reference nucleotide or polypeptide sequences stored on a computer readable medium. A “sequence comparer” refers to one or more programs which are implemented on the computer system to compare a nucleotide or polypeptide sequence with other nucleotide or polypeptide sequences and/or compounds including but not limited to peptides, peptidomimetics, and chemicals stored within the data storage means. For example, the sequence comparer may compare the nucleotide sequences of nucleic acid codes of the invention or the amino acid sequences of the polypeptide codes of the invention stored on a computer readable medium to reference sequences stored on a computer readable medium to identify homologies, motifs implicated in biological function, or structural motifs. The various sequence comparer programs identified elsewhere in this patent specification are particularly contemplated for use in this aspect of the invention.

In one embodiment, a process is used for comparing a new nucleotide or protein sequence with a database of sequences in order to determine the homology levels between the new sequence and the sequences in the database. The database of sequences can be a private database stored within the computer system, or a public database such as GENBANK, PIR or SWISSPROT that is available through the Internet.

The process begins at a start state and then moves to a state wherein the new sequence to be compared is stored to a memory in a computer system. As discussed above, the memory could be any type of memory, including RAM or an internal storage device.

The process then moves to a state wherein a database of sequences is opened for analysis and comparison. The process then moves to a state wherein the first sequence stored in the database is read into a memory on the computer. A comparison is then performed to determine if the first sequence is the same as the new sequence. It is important to note that this step is not limited to performing an exact comparison between the new sequence and the first sequence in the database. Methods are well known to those of skill in the art for comparing two nucleotide or protein sequences, even if they are not identical. For example, gaps can be introduced into one sequence in order to raise the homology level between the two tested sequences. The parameters that control whether gaps or other features are introduced into a sequence during comparison are normally entered by the user of the computer system.

Once a comparison of the two sequences has been performed, a determination is made at a decision state whether the two sequences are the same. Of course, the term “same” is not limited to sequences that are absolutely identical. Sequences that are within the homology parameters entered by the user will be marked as “same” in the process.

If a determination is made that the two sequences are the same, the process moves to a state wherein the name of the sequence from the database is displayed to the user. This state notifies the user that the sequence with the displayed name fulfills the homology constraints that were entered. Once the name of the stored sequence is displayed to the user, the process moves to a decision state wherein a determination is made whether more sequences exist in the database. If no more sequences exist in the database, then the process terminates at an end state. However, if more sequences do exist in the database, then the process moves to a state wherein a pointer is moved to the next sequence in the database so that it can be compared to the new sequence. In this manner, the new sequence is aligned and compared with every sequence in the database.

It should be noted that if a determination had been made at the first decision state that the sequences were not homologous, then the process would move immediately to the decision state that determines whether any other sequences are available in the database for comparison.
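The database comparison loop described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the `homology` function uses the standard-library `difflib.SequenceMatcher` as a crude stand-in for a real alignment program such as those named in this specification, the database is modeled as an in-memory dictionary, and all names and example sequences are hypothetical.

```python
from difflib import SequenceMatcher


def homology(a: str, b: str) -> float:
    """Crude similarity score in [0, 1]; an assumed stand-in for a real
    alignment-based homology measure."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def search_database(new_seq: str, database: dict, threshold: float) -> list:
    """Return the names of stored sequences that count as "same" as new_seq,
    i.e. whose score meets the user-entered homology threshold."""
    hits = []
    # Each loop iteration corresponds to moving the pointer to the next
    # sequence in the database until no more sequences exist.
    for name, stored_seq in database.items():
        if homology(new_seq, stored_seq) >= threshold:
            hits.append(name)  # this name would be displayed to the user
    return hits


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    db = {"seqA": "ATGGCCATTG", "seqB": "TTTTTTTTTT", "seqC": "ATGGCCATTC"}
    for hit in search_database("ATGGCCATTG", db, threshold=0.8):
        print(hit)
```

A sequence that fails the threshold is simply skipped, which mirrors the process moving directly to the check for remaining database entries.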

Accordingly, one aspect of the present invention is a computer system comprising a processor, a data storage device having stored thereon a nucleic acid code of the invention or a polypeptide code of the invention, a data storage device having retrievably stored thereon reference nucleotide sequences or polypeptide sequences to be compared to the nucleic acid code of the invention or polypeptide code of the invention and a sequence comparer for conducting the comparison. The sequence comparer may indicate a homology level between the sequences compared or identify motifs implicated in biological function and structural motifs in the nucleic acid code of the invention and polypeptide codes of the invention or it may identify structural motifs in sequences which are compared to these nucleic acid codes and polypeptide codes. In some embodiments, the data storage device may have stored thereon the sequences of at least 2, 5, 10, 15, 20, 25, 30, or 50 of the nucleic acid codes of the invention or polypeptide codes of the invention.

Another aspect of the present invention is a method for determining the level of homology between a nucleic acid code of the invention and a reference nucleotide sequence, comprising the steps of reading the nucleic acid code and the reference nucleotide sequence through the use of a computer program which determines homology levels and determining homology between the nucleic acid code and the reference nucleotide sequence with the computer program. The computer program may be any of a number of computer programs for determining homology levels, including those specifically enumerated herein, including BLAST2N with the default parameters or with any modified parameters. The method may be implemented using the computer systems described above. The method may also be performed by reading 2, 5, 10, 15, 20, 25, 30, or 50 of the above described nucleic acid codes of the invention through the use of the computer program and determining homology between the nucleic acid codes and reference nucleotide sequences.

In another embodiment, a process is carried out in a computer for determining whether two sequences are homologous. The process begins at a start state and then moves to a state wherein a first sequence to be compared is stored to a memory. The second sequence to be compared is then stored in a memory. The process then moves to a state wherein the first character in the first sequence is read and then to a state wherein the first character of the second sequence is read. It should be understood that if the sequence is a nucleotide sequence, then the character would normally be either A, T, C, G or U. If the sequence is a protein sequence, then it should be in the single letter amino acid code so that the first and second sequences can be easily compared.

A determination is then made at a decision state whether the two characters are the same. If they are the same, then the process moves to a state wherein the next characters in the first and second sequences are read. A determination is then made whether the next characters are the same. If they are, then the process continues this loop until two characters are not the same. If a determination is made that the next two characters are not the same, the process moves to a decision state to determine whether there are any more characters in either sequence to read.

If there are no more characters to read, then the process moves to a state wherein the level of homology between the first and second sequences is displayed to the user. The level of homology is determined by calculating the proportion of characters between the sequences that were the same out of the total number of characters in the first sequence. Thus, if every character in a first nucleotide sequence aligned with every character in a second sequence, the homology level would be 100%.
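The homology calculation just described can be sketched as follows. This is a minimal Python sketch under assumptions: it compares characters position by position without introducing gaps, counts every matching position rather than stopping at the first mismatch, and uses a hypothetical function name. It is not a substitute for an alignment program.

```python
def percent_identity(first: str, second: str) -> float:
    """Proportion of positions at which the two sequences carry the same
    character, expressed as a percentage of the length of the first
    sequence. No gaps are introduced (a simplifying assumption)."""
    if not first:
        raise ValueError("first sequence must not be empty")
    # zip() stops at the shorter sequence, so positions with no partner
    # in the second sequence count as non-matching.
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return 100.0 * matches / len(first)


if __name__ == "__main__":
    print(percent_identity("ATCG", "ATCG"))  # identical sequences
    print(percent_identity("ATCG", "ATGG"))  # one mismatch out of four
```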

Alternatively, the computer program may be a computer program which compares the nucleotide sequences of the nucleic acid codes of the present invention to reference nucleotide sequences in order to determine whether the nucleic acid code of the invention differs from a reference nucleic acid sequence at one or more positions. Optionally such a program records the length and identity of inserted, deleted or substituted nucleotides with respect to the sequence of either the reference polynucleotide or the nucleic acid code of the invention. In one embodiment, the computer program may be a program which determines whether the nucleotide sequences of the nucleic acid codes of the invention contain one or more single nucleotide polymorphisms (SNPs) with respect to a reference nucleotide sequence. These single nucleotide polymorphisms may each comprise a single base substitution, insertion, or deletion.
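A program of this kind, recording the position and identity of substituted, inserted or deleted nucleotides, might be sketched as follows. The sketch assumes the Python standard-library `difflib.SequenceMatcher` as a simple stand-in for a proper alignment step; the function name, the 0-based position convention, and the example sequences are hypothetical.

```python
from difflib import SequenceMatcher

# Map difflib opcode tags onto the terms used in the text.
_KIND = {"replace": "substitution", "delete": "deletion", "insert": "insertion"}


def find_differences(reference: str, query: str) -> list:
    """Return (kind, ref_position, ref_bases, query_bases) tuples, one per
    differing region; ref_position is 0-based within the reference."""
    matcher = SequenceMatcher(None, reference, query, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical stretch, nothing to record
        diffs.append((_KIND[tag], i1, reference[i1:i2], query[j1:j2]))
    return diffs


if __name__ == "__main__":
    # Made-up sequences: a single base substitution at reference position 3.
    for diff in find_differences("ATGGCC", "ATGACC"):
        print(diff)
```

The length of each inserted, deleted or substituted stretch is the length of the recorded base strings, so a single nucleotide polymorphism appears as an entry whose strings are at most one base long.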

Another aspect of the present invention is a method for determining the level of homology between a polypeptide code of the invention and a reference polypeptide sequence, comprising the steps of reading the polypeptide code of the invention and the reference polypeptide sequence through use of a computer program which determines homology levels and determining homology between the polypeptide code and the reference polypeptide sequence using the computer program.

Accordingly, another aspect of the present invention is a method for determining whether a nucleic acid code of the invention differs at one or more nucleotides from a reference nucleotide sequence comprising the steps of reading the nucleic acid code and the reference nucleotide sequence through use of a computer program which identifies differences between nucleic acid sequences and identifying differences between the nucleic acid code and the reference nucleotide sequence with the computer program. In some embodiments, the computer program is a program which identifies single nucleotide polymorphisms. The method may be implemented by the computer systems described above and the method described supra. The method may also be performed by reading at least 2, 5, 10, 15, 20, 25, 30, or 50 of the nucleic acid codes of the invention and the reference nucleotide sequences through the use of the computer program and identifying differences between the nucleic acid codes and the reference nucleotide sequences with the computer program.

In other embodiments the computer based system may further comprise an identifier for identifying features within the nucleotide sequences of the nucleic acid codes of the invention or the amino acid sequences of the polypeptide codes of the invention.

An “identifier” refers to one or more programs which identify certain features within the above-described nucleotide sequences of the nucleic acid codes of the invention or the amino acid sequences of the polypeptide codes of the invention. In one embodiment, the identifier may comprise a program which identifies an open reading frame in the cDNA codes of the invention.
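An identifier of open reading frames might, in its simplest form, be sketched as follows. This Python sketch rests on several assumptions that are not stated in the text: it scans only the three forward reading frames, treats ATG as the sole start codon and TAA, TAG and TGA as stop codons, and uses a hypothetical function name and minimum-length parameter.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list:
    """Return (start, end) spans, 0-based and end-exclusive, of open reading
    frames on the forward strand. Each span runs from an ATG through the
    first in-frame stop codon. min_codons counts the codons before the
    stop, including the start codon."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next start codon
    return orfs


if __name__ == "__main__":
    example = "CCATGAAATAGGG"  # made-up sequence for illustration
    for begin, end in find_orfs(example):
        print(begin, end, example[begin:end])
```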

In another embodiment, an identifier process is used to detect the presence of a feature in a sequence. The process begins at a start state and then moves to a state wherein a first sequence that is to be checked for features is stored to a memory in the computer system. The process then moves to a state wherein a database of sequence features is opened. Such a database would include a list of each feature's attributes along with the name of the feature. For example, a feature name could be “Initiation Codon” and the attribute would be “ATG”. Another example would be the feature name “TAATAA Box” and the feature attribute would be “TAATAA”. An example of such a database is produced by the University of Wisconsin Genetics Computer Group (see Worldwide Website: gcg.com).

Once the database of features is opened, the process moves to a state wherein the first feature is read from the database. A comparison of the attribute of the first feature with the first sequence is then made. A determination is then made at a decision state whether the attribute of the feature was found in the first sequence. If the attribute was found, then the process moves to a state wherein the name of the found feature is displayed to the user.

The process then moves to a decision state wherein a determination is made whether more features exist in the database. If no more features exist, then the process terminates at an end state. However, if more features do exist in the database, then the process reads the next sequence feature and loops back to the state wherein the attribute of the next feature is compared against the first sequence.

It should be noted that if the feature attribute is not found in the first sequence at the first decision state, the process moves directly to the decision state that determines whether any more features exist in the database.
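The identifier process described above can be sketched in Python as follows. The feature database is modeled here as a small in-memory dictionary holding the two example features named in the text; the function name, the exact-substring matching, and the example sequence are assumptions made for illustration.

```python
# Feature name -> feature attribute, using the two examples from the text.
FEATURE_DATABASE = {
    "Initiation Codon": "ATG",
    "TAATAA Box": "TAATAA",
}


def identify_features(sequence: str, features: dict = FEATURE_DATABASE) -> list:
    """Return (feature_name, position) for each feature whose attribute
    occurs in the sequence; position is the 0-based index of the first
    occurrence. Exact substring matching is a simplifying assumption."""
    found = []
    # Each iteration reads the next feature from the database; features
    # whose attribute is absent are skipped without being reported.
    for name, attribute in features.items():
        position = sequence.find(attribute)
        if position != -1:
            found.append((name, position))
    return found


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    for name, position in identify_features("GGTAATAACCATGC"):
        print(name, position)
```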

In another embodiment, the identifier may comprise a molecular modeling program which determines the 3-dimensional structure of the polypeptide codes of the invention. In some embodiments, the molecular modeling program identifies target sequences that are most compatible with profiles representing the structural environments of the residues in known three-dimensional protein structures. (See, e.g., U.S. Pat. No. 5,436,850). In another technique, the known three-dimensional structures of proteins in a given family are superimposed to define the structurally conserved regions in that family. This protein modeling technique also uses the known three-dimensional structure of a homologous protein to approximate the structure of the polypeptide codes of the invention. (See e.g., U.S. Pat. No. 5,557,535). Conventional homology modeling techniques have been used routinely to build models of proteases and antibodies. (Sowdhamini et al., (1997)). Comparative approaches can also be used to develop three-dimensional protein models when the protein of interest has poor sequence identity to template proteins. In some cases, proteins fold into similar three-dimensional structures despite having very weak sequence identities. For example, the three-dimensional structures of a number of helical cytokines fold in similar three-dimensional topology in spite of weak sequence homology.

The recent development of threading methods now enables the identification of likely folding patterns in a number of situations where the structural relatedness between target and template(s) is not detectable at the sequence level. In hybrid methods, fold recognition is performed using Multiple Sequence Threading (MST), structural equivalencies are deduced from the threading output using the distance geometry program DRAGON to construct a low resolution model, and a full-atom representation is constructed using a molecular modeling package such as QUANTA.

According to this 3-step approach, candidate templates are first identified by using the novel fold recognition algorithm MST, which is capable of performing simultaneous threading of multiple aligned sequences onto one or more 3-D structures. In a second step, the structural equivalencies obtained from the MST output are converted into interresidue distance restraints and fed into the distance geometry program DRAGON, together with auxiliary information obtained from secondary structure predictions. The program combines the restraints in an unbiased manner and rapidly generates a large number of low resolution model conformations. In a third step, these low resolution model conformations are converted into full-atom models and subjected to energy minimization using the molecular modeling package QUANTA. (See, e.g., Aszódi et al., (1997)).

The results of the molecular modeling analysis may then be used in rational drug design techniques to identify agents which modulate the activity of the polypeptide codes of the invention.

Accordingly, another aspect of the present invention is a method of identifying a feature within the nucleic acid codes of the invention or the polypeptide codes of the invention comprising reading the nucleic acid code(s) or the polypeptide code(s) through the use of a computer program which identifies features therein and identifying features within the nucleic acid code(s) or polypeptide code(s) with the computer program. In one embodiment, the computer program comprises a computer program which identifies open reading frames. In a further embodiment, the computer program identifies structural motifs in a polypeptide sequence. In another embodiment, the computer program comprises a molecular modeling program. The method may be performed by reading a single sequence or at least 2, 5, 10, 15, 20, 25, 30, or 50 of the nucleic acid codes of the invention or the polypeptide codes of the invention through the use of the computer program and identifying features within the nucleic acid codes or polypeptide codes with the computer program.
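By way of illustration only, a program which identifies open reading frames could be sketched as follows in Python. The function name find_orfs, the min_codons parameter, and the restriction to the forward strand with ATG start codons are assumptions made for this sketch and are not taken from the present disclosure.

```python
# Minimal open reading frame (ORF) scan. Illustrative sketch only:
# forward strand, ATG start codons, standard stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return a list of (start, end, frame) tuples for ATG...stop ORFs.

    start is the 0-based index of the A of the ATG; end is the index just
    past the stop codon; frame is 0, 1 or 2. min_codons counts the start
    codon plus the following sense codons and excludes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume the search after the stop codon
    return orfs
```

For example, find_orfs("CCATGAAATTTTAGGG") returns [(2, 14, 2)], corresponding to the subsequence ATGAAATTTTAG.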

The nucleic acid codes of the invention or the polypeptide codes of the invention may be stored and manipulated in a variety of data processor programs in a variety of formats. For example, they may be stored as text in a word processing file, such as Microsoft WORD or WORDPERFECT, or as an ASCII file in a variety of database programs familiar to those of skill in the art, such as DB2, SYBASE, or ORACLE. In addition, many computer programs and databases may be used as sequence comparers, identifiers, or sources of reference nucleotide or polypeptide sequences to be compared to the nucleic acid codes of the invention or the polypeptide codes of the invention. The following list is intended not to limit the invention but to provide guidance to programs and databases which are useful with the nucleic acid codes of the invention or the polypeptide codes of the invention. The programs and databases which may be used include, but are not limited to: MacPattern (EMBL), DiscoveryBase (Molecular Applications Group), GeneMine (Molecular Applications Group), Look (Molecular Applications Group), MacLook (Molecular Applications Group), BLAST and BLAST2 (NCBI), BLASTN and BLASTX (Altschul et al., 1990), FASTA (Pearson and Lipman, 1988), FASTDB (Brutlag et al., 1990), Catalyst (Molecular Simulations Inc.), Catalyst/SHAPE (Molecular Simulations Inc.), Cerius2.DBAccess (Molecular Simulations Inc.), HypoGen (Molecular Simulations Inc.), Insight II (Molecular Simulations Inc.), Discover (Molecular Simulations Inc.), CHARMm (Molecular Simulations Inc.), Felix (Molecular Simulations Inc.), DelPhi (Molecular Simulations Inc.), QuanteMM (Molecular Simulations Inc.), Homology (Molecular Simulations Inc.), Modeler (Molecular Simulations Inc.), ISIS (Molecular Simulations Inc.), Quanta/Protein Design (Molecular Simulations Inc.), WebLab (Molecular Simulations Inc.), WebLab Diversity Explorer (Molecular Simulations Inc.), Gene Explorer (Molecular Simulations Inc.), SeqFold (Molecular Simulations Inc.), the EMBL/Swissprotein database, the MDL Available Chemicals Directory database, the MDL Drug Data Report database, the Comprehensive Medicinal Chemistry database, Derwent's World Drug Index database, the BioByte MasterFile database, the Genbank database, and the Genseqn database. Many other programs and databases would be apparent to one of skill in the art given the present disclosure.

Motifs which may be detected using the above programs include sequences encoding leucine zippers, helix-turn-helix motifs, glycosylation sites, ubiquitination sites, alpha helices, and beta sheets, signal sequences encoding signal peptides which direct the secretion of the encoded proteins, sequences implicated in transcription regulation such as homeoboxes, acidic stretches, enzymatic active sites, substrate binding sites, and enzymatic cleavage sites.
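As a non-limiting illustration of motif detection, the following Python sketch scans a polypeptide sequence for the commonly used N-linked glycosylation consensus (Asn, then any residue other than Pro, then Ser or Thr). The function name and the example sequence are hypothetical. A match indicates only a candidate site, and none of the programs listed above is asserted to operate in this manner.

```python
import re

# Consensus for a candidate N-linked glycosylation site:
# Asn, any residue except Pro, then Ser or Thr.
# The lookahead makes the match zero-width so overlapping sites are reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_n_glyc_sites(protein):
    """Return the 1-based positions of the Asn residue of each candidate site."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]
```

For example, find_n_glyc_sites("MNGSANPTNNTS") returns [2, 9, 10]; the Asn at position 6 is skipped because it is followed by Pro.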

Throughout this application, various publications, patents and published patent applications are cited. The disclosures of these publications, patents and published patent specifications referenced in this application are hereby incorporated by reference into the present disclosure to more fully describe the state of the art to which this invention pertains.

EXAMPLES

Example 1 DNA Extraction

Donors were unrelated and healthy. They were sufficiently diverse to be representative of a heterogeneous French population. The DNA from 100 individuals was extracted and tested for the detection of the biallelic markers.

30 ml of peripheral venous blood were taken from each donor in the presence of EDTA. Cells (pellet) were collected after centrifugation for 10 minutes at 2000 rpm. Red cells were lysed by a lysis solution (50 ml final volume: 10 mM Tris pH 7.6; 5 mM MgCl2; 10 mM NaCl). The solution was centrifuged (10 minutes, 2000 rpm) as many times as necessary to eliminate the residual red cells present in the supernatant, after resuspension of the pellet in the lysis solution.

The pellet of white cells was lysed overnight at 42° C. with 3.7 ml of lysis solution composed of:

-   3 ml TE 10−2 (Tris-HCl 10 mM, EDTA 2 mM)/NaCl 0.4 M
-   200 μl SDS 10%
-   500 μl K-proteinase (2 mg K-proteinase in TE 10−2/NaCl 0.4 M).

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) was added. After vigorous agitation, the solution was centrifuged for 20 minutes at 10000 rpm.

For the precipitation of DNA, 2 to 3 volumes of 100% ethanol were added to the previous supernatant, and the solution was centrifuged for 30 minutes at 2000 rpm. The DNA solution was rinsed three times with 70% ethanol to eliminate salts, and centrifuged for 20 minutes at 2000 rpm. The pellet was dried at 37° C., and resuspended in 1 ml TE 10−1 or 1 ml water. The DNA concentration was evaluated by measuring the OD at 260 nm (1 unit OD=50 μg/ml DNA).

To determine the presence of proteins in the DNA solution, the OD 260/OD 280 ratio was determined. Only DNA preparations having an OD 260/OD 280 ratio between 1.8 and 2 were used in the subsequent examples described below.
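The concentration and purity criteria above can be expressed as a short calculation. The following Python sketch is illustrative only; the function names, the dilution_factor parameter, and the treatment of the 1.8 to 2 range as inclusive are assumptions.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Convert an OD reading at 260 nm to a DNA concentration in ug/ml.

    Uses the conversion stated in the text: 1 OD unit = 50 ug/ml DNA.
    dilution_factor (an assumed parameter) scales a reading taken on a
    diluted aliquot back to the undiluted preparation.
    """
    return od260 * 50.0 * dilution_factor


def passes_purity_check(od260, od280, low=1.8, high=2.0):
    """Return True if the OD 260/OD 280 ratio lies within [low, high]."""
    ratio = od260 / od280
    return low <= ratio <= high
```

For example, a reading of 0.5 OD units on a 20-fold dilution corresponds to 0.5 x 50 x 20 = 500 μg/ml, and a preparation with OD 260 = 0.5 and OD 280 = 0.27 (ratio of about 1.85) would be retained.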

The pool was constituted by mixing equivalent quantities of DNA from each individual.

Example 2 Amplification of Genomic DNA by PCR

The amplification of specific genomic sequences of the DNA samples of example 1 was carried out on the pool of DNA obtained previously. In addition, 50 individual samples were similarly amplified.

PCR assays were performed using the following protocol:

-   Final volume: 25 μl
-   DNA: 2 ng/μl
-   MgCl₂: 2 mM
-   dNTP (each): 200 μM
-   Primer (each): 2.9 ng/μl
-   Ampli Taq Gold DNA polymerase: 0.05 unit/μl
-   PCR buffer (10x = 0.1 M TrisHCl pH 8.3, 0.5 M KCl): 1x
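For illustration, the per-reaction amounts implied by the final concentrations above in a 25 μl reaction can be computed as follows. The Python names are hypothetical. Stock concentrations other than the 10x buffer are not given in this example and are therefore not assumed.

```python
def per_reaction_amounts(volume_ul=25.0):
    """Total amount of each component in one reaction of volume_ul microliters.

    Final concentrations are those listed in the protocol; amounts are
    concentration x volume, with 1 mM = 1 nmol/ul and 1 uM = 0.001 nmol/ul.
    """
    return {
        "template_dna_ng": 2.0 * volume_ul,     # 2 ng/ul
        "mgcl2_nmol": 2.0 * volume_ul,          # 2 mM = 2 nmol/ul
        "each_dntp_nmol": 0.2 * volume_ul,      # 200 uM = 0.2 nmol/ul
        "each_primer_ng": 2.9 * volume_ul,      # 2.9 ng/ul
        "polymerase_units": 0.05 * volume_ul,   # 0.05 unit/ul
        "buffer_10x_ul": volume_ul / 10.0,      # 10x stock diluted to 1x
    }
```

For a 25 μl reaction this gives 50 ng of template DNA, 50 nmol of MgCl₂, 5 nmol of each dNTP, 72.5 ng of each primer, 1.25 units of polymerase, and 2.5 μl of 10x buffer.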

Each pair of first primers was designed using the sequence information of the GLYT1 gene disclosed herein and the OSP software (Hillier & Green, 1991). This first pair of primers was about 20 nucleotides in length and had the sequences shown as SEQ ID NOs:36 and 37.

Primers PU contain the following additional PU 5′ sequence: TGTAAAACGACGGCCAGT; primers RP contain the following RP 5′ sequence: CAGGAAACAGCTATGACC. The primer containing the additional PU 5′ sequence is listed as SEQ ID NO:36. The primer containing the additional RP 5′ sequence is listed as SEQ ID NO:37.
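The relationship between a tailed primer and its gene-specific portion can be illustrated as follows. In this Python sketch only the PU and RP tail sequences are taken from the text; the gene-specific sequence used in the usage note is a made-up placeholder, not SEQ ID NO:36 or 37, and the helper names are hypothetical.

```python
# 5' tails stated in the text.
PU_TAIL = "TGTAAAACGACGGCCAGT"
RP_TAIL = "CAGGAAACAGCTATGACC"


def add_tail(tail, gene_specific):
    """Return the full primer: the 5' tail followed by the gene-specific part."""
    return tail + gene_specific.upper()


def strip_tail(primer, tail):
    """Return the gene-specific part of a primer carrying the given 5' tail."""
    if not primer.startswith(tail):
        raise ValueError("primer does not start with the expected 5' tail")
    return primer[len(tail):]
```

For example, with the placeholder gene-specific sequence "ACGTACGT", add_tail(PU_TAIL, "ACGTACGT") gives "TGTAAAACGACGGCCAGTACGTACGT", and strip_tail applied to that result with PU_TAIL recovers "ACGTACGT".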

The synthesis of these primers was performed following the phosphoramidite method, on a GENSET UFPS 24.1 synthesizer.

DNA amplification was performed on a Genius II thermocycler. After heating at 95° C. for 10 min, 40 cycles were performed. Each cycle comprised: 30 sec at 95° C., 1 min at 54° C., and 30 sec at 72° C. For final elongation, 10 min at 72° C. ended the amplification. The quantities of the amplification products obtained were determined on 96-well microtiter plates, using a fluorometer and Picogreen as the intercalating agent (Molecular Probes).
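As a check on the cycling program described above, the total programmed hold time can be computed. The following Python sketch is illustrative; it ignores temperature ramping between steps, which depends on the instrument, and its names are hypothetical.

```python
def total_hold_seconds(initial_s, cycles, step_seconds, final_s):
    """Sum of programmed hold times, excluding ramping between temperatures."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Program from the text: 10 min at 95 C, then 40 cycles of
# (30 s at 95 C, 60 s at 54 C, 30 s at 72 C), then 10 min at 72 C.
PROGRAM_SECONDS = total_hold_seconds(
    initial_s=10 * 60,
    cycles=40,
    step_seconds=(30, 60, 30),
    final_s=10 * 60,
)
```

Here PROGRAM_SECONDS evaluates to 6000 seconds, that is, 100 minutes of programmed hold time.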

In addition, RT-PCR was used to identify novel cDNAs present in the cells of normal and/or schizophrenic individuals. Eight novel splice variants were identified, and are shown as SEQ ID NOs: 14-21 (nucleotide sequences) and SEQ ID NOs: 26-33 (polypeptide sequences) and diagrammed in FIG. 1. Certain of the novel variants include novel exons, which are presented herein as SEQ ID NOs:2-9.

Example 3 Identification of Biallelic Markers: Sequencing of Amplified Genomic DNA

The sequencing of the amplified DNA obtained in example 2 was carried out on ABI 377 sequencers. The sequences of the amplification products were determined using automated dideoxy terminator sequencing reactions with a dye terminator cycle sequencing protocol. The products of the sequencing reactions were run on sequencing gels and the sequences were determined using gel image analysis (ABI Prism DNA Sequencing Analysis software, version 2.1.2).

Example 4 Preparation of Antibody Compositions to the GlyT1 Protein

Substantially pure protein or polypeptide is isolated from transfected or transformed cells containing an expression vector encoding the GlyT1 protein or a portion thereof. The concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the protein can then be prepared as follows:

A. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibody to epitopes in the GlyT1 protein or a portion thereof can be prepared from murine hybridomas according to the classical method of Kohler, G. and Milstein, C., (1975) or derivative methods thereof. See, also, Harlow and Lane (1988).

Briefly, a mouse is repetitively inoculated with a few micrograms of the GlyT1 protein or a portion thereof over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall, (1980), and derivative methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular Biology, Elsevier, New York, Section 21-2.

B. Polyclonal Antibody Production by Immunization

Polyclonal antiserum containing antibodies to heterogeneous epitopes in the GlyT1 protein or a portion thereof can be prepared by immunizing a suitable non-human animal with the GlyT1 protein or a portion thereof, which can be unmodified or modified to enhance immunogenicity. A suitable non-human animal is preferably a non-human mammal, usually a mouse, rat, rabbit, goat, or horse. Alternatively, a crude preparation which has been enriched for GlyT1 concentration can be used to generate antibodies. Such proteins, fragments or preparations are introduced into the non-human mammal in the presence of an appropriate adjuvant (e.g. aluminum hydroxide, RIBI, etc.) which is known in the art. In addition, the protein, fragment or preparation can be pretreated with an agent which will increase antigenicity; such agents are known in the art and include, for example, methylated bovine serum albumin (mBSA), bovine serum albumin (BSA), Hepatitis B surface antigen, and keyhole limpet hemocyanin (KLH). Serum from the immunized animal is collected, treated and tested according to known procedures. If the serum contains polyclonal antibodies to undesired epitopes, the polyclonal antibodies can be purified by immunoaffinity chromatography.

Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. Also, host animals vary in response to site of inoculations and dose, with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. Techniques for producing and processing polyclonal antisera are known in the art, see for example, Mayer and Walker (1987). An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al. (1971).

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 μM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher, D., (1980).

Antibody preparations prepared according to either the monoclonal or the polyclonal protocol are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample. The antibodies may also be used in therapeutic compositions for killing cells expressing the protein or reducing the levels of the protein in the body.

While the preferred embodiments of the invention have been illustrated and described, it will be appreciated that various changes can be made therein by one skilled in the art without departing from the spirit and scope of the invention.

REFERENCES

-   Abbondanzo S J et al., 1993, Methods in Enzymology, Academic Press, New York, pp 803-823
-   Ajioka R. S. et al., Am. J. Hum. Genet., 60:1439-1447, 1997
-   Altschul et al., 1990, J. Mol. Biol. 215(3):403-410
-   Altschul et al., 1993, Nature Genetics 3:266-272
-   Altschul et al., 1997, Nuc. Acids Res. 25:3389-3402
-   Anton M. et al., 1995, J. Virol., 69: 4600-4606
-   Araki K et al. (1995) Proc. Natl. Acad. Sci. USA. 92(1):160-4.
-   Aszódi et al., Proteins: Structure, Function, and Genetics, Supplement 1:38-42 (1997)
-   Ausubel et al. (1989) Current Protocols in Molecular Biology, Green Publishing Associates and Wiley Interscience, N.Y.
-   Baubonis W. (1993) Nucleic Acids Res. 21(9):2025-9.
-   Beaucage et al., Tetrahedron Lett 1981, 22: 1859-1862
-   Bradley A., 1987, Production and analysis of chimeric mice. In: E. J. Robertson (Ed.), Teratocarcinomas and embryonic stem cells: A practical approach. IRL Press, Oxford, pp. 113.
-   Bram R J et al., 1993, Mol. Cell Biol., 13: 4760-4769
-   Brown E L, Belagaje R, Ryan M J, Khorana H G, Methods Enzymol 1979; 68:109-151
-   Brutlag et al. Comp. App. Biosci. 6:237-245, 1990
-   Bush et al., 1997, J. Chromatogr., 777: 311-328.
-   Chai H. et al. (1993) Biotechnol. Appl. Biochem. 18:259-273.
-   Chee et al. (1996) Science, 274:610-614.
-   Chen and Kwok Nucleic Acids Research 25:347-353 1997
-   Chen et al. (1987) Mol. Cell. Biol. 7:2745-2752.
-   Chen et al. Proc. Natl. Acad. Sci. USA 94/20 10756-10761, 1997
-   Cho R J et al., 1998, Proc. Natl. Acad. Sci. USA, 95(7): 3752-3757.
-   Chou J. Y., 1989, Mol. Endocrinol., 3: 1511-1514.
-   Clark A. G. (1990) Mol. Biol. Evol. 7:111-122.
-   Coles R, Caswell R, Rubinsztein D C, Hum Mol Genet 1998; 7:791-800
-   Compton J. (1991) Nature. 350(6313):91-92.
-   Davis L. G., M. D. Dibner, and J. F. Battey, Basic Methods in Molecular Biology, ed., Elsevier Press, NY, 1986
-   Dempster et al., (1977) J. R. Stat. Soc., 39B:1-38.
-   Dent D S & Latchman D S (1993) The DNA mobility shift assay. In: Transcription Factors: A Practical Approach (Latchman D S, ed.) pp 1-26. Oxford: IRL Press
-   Eckner R. et al. (1991) EMBO J. 10:3513-3522.
-   Edwards and Leatherbarrow, Analytical Biochemistry, 246, 1-6 (1997)
-   Engvall, E., Meth. Enzymol. 70:419 (1980)
-   Excoffier L. and Slatkin M. (1995) Mol. Biol. Evol., 12(5): 921-927.
-   Feldman and Steg, 1996, Medicine/Sciences, synthese, 12:47-55
-   Felici F., 1991, J. Mol. Biol., Vol. 222:301-310
-   Fields and Song, 1989, Nature, 340: 245-246
-   Fisher, D., Chap. 42 in: Manual of Clinical Immunology, 2d Ed. (Rose and Friedman, Eds.) Amer. Soc. For Microbiol., Washington, D.C. (1980)
-   Flotte et al. (1992) Am. J. Respir. Cell Mol. Biol. 7:349-356.
-   Fodor et al. (1991) Science 251:767-777.
-   Fraley et al. (1979) Proc. Natl. Acad. Sci. USA. 76:3348-3352.
-   Fried M, Crothers D M, Nucleic Acids Res 1981; 9:6505-6525
-   Fromont-Racine M. et al., 1997, Nature Genetics, 16(3): 277-282.
-   Fuller S. A. et al. (1996) Immunology in Current Protocols in Molecular Biology, Ausubel et al. Eds, John Wiley & Sons, Inc., USA.
-   Furth P. A. et al. (1994) Proc. Natl. Acad. Sci. USA. 91:9302-9306.
-   Garner M M, Revzin A, Nucleic Acids Res 1981; 9:3047-3060
-   Geysen H. Mario et al. 1984. Proc. Natl. Acad. Sci. U.S.A. 81:3998-4002
-   Ghosh and Bacchawat, 1991, Targeting of liposomes to hepatocytes, IN: Liver Diseases, Targeted diagnosis and therapy using specific receptors and ligands. Wu et al. Eds., Marcel Dekeker, New York, pp. 87-104.
-   Gonnet et al., 1992, Science 256:1443-1445
-   Gopal (1985) Mol. Cell. Biol., 5:1188-1190.
-   Gossen M. et al. (1992) Proc. Natl. Acad. Sci. USA. 89:5547-5551.
-   Gossen M. et al. (1995) Science. 268:1766-1769.
-   Graham et al. (1973) Virology 52:456-457.
-   Green et al., Ann. Rev. Biochem. 55:569-597 (1986)
-   Griffin et al. Science 245:967-971 (1989)
-   Grompe, M. (1993) Nature Genetics. 5:111-117.
-   Grompe, M. et al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86:5855-5892.
-   Gu H. et al. (1993) Cell 73:1155-1164.
-   Gu H. et al. (1994) Science 265:103-106.
-   Guatelli J C et al. Proc. Natl. Acad. Sci. USA. 35:273-286.
-   Hacia J G, Brody L C, Chee M S, Fodor S P, Collins F S, Nat Genet. 1996; 14(4):441-447
-   Hall L. A. and Smirnov I. P. (1997) Genome Research, 7:378-388.
-   Hames B. D. and Higgins S. J. (1985) Nucleic Acid Hybridization: A Practical Approach. Hames and Higgins Ed., IRL Press, Oxford.
-   Harju L, Weber T, Alexandrova L, Lukin M, Ranki M, Jalanko A, Clin Chem 1993; 39(11 Pt 1):2282-2287
-   Harland et al. (1985) J. Cell. Biol. 101:1094-1095.
-   Harlow, E., and D. Lane. 1988. Antibodies A Laboratory Manual. Cold Spring Harbor Laboratory. pp. 53-242
-   Harper J W et al., 1993, Cell, 75: 805-816
-   Hawley M. E. et al. (1994) Am. J. Phys. Anthropol. 18:104.
-   Henikoff and Henikoff, 1993, Proteins 17:49-61
-   Higgins et al., 1996, Methods Enzymol. 266:383-402
-   Hillier L. and Green P. Methods Appl., 1991, 1: 124-8.
-   Hoess et al. (1986) Nucleic Acids Res, 14:2287-2300.
-   Huang L. et al. (1996) Cancer Res 56(5):1137-1141.
-   Huygen et al. (1996) Nature Medicine. 2(8):893-898.
-   Izant J G, Weintraub H, Cell 1984 April; 36(4):1007-15
-   Julan et al. (1992) J. Gen. Virol. 73:3251-3255.
-   Kanegae Y. et al., Nucl. Acids Res. 23:3816-3821 (1995).
-   Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2267-2268
-   Khoury J. et al., Fundamentals of Genetic Epidemiology, Oxford University Press, NY, 1993
-   Kim U-J. et al. (1996) Genomics 34:213-218.
-   Klein et al. (1987) Nature. 327:70-73.
-   Kohler, G. and Milstein, C., Nature 256:495 (1975)
-   Koller et al. (1992) Annu. Rev. Immunol. 10:705-730.
-   Kozal M J, Shah N, Shen N, Yang R, Fucini R, Merigan T C, Richman D D, Morris D, Hubbell E, Chee M, Gingeras T R, Nat Med 1996; 2(7):753-759
-   Lander and Schork, Science, 265, 2037-2048, 1994
-   Landegren U. et al. (1998) Genome Research, 8:769-776.
-   Lange K. (1997) Mathematical and Statistical Methods for Genetic Analysis. Springer, New York.
-   Lenhard T. et al. (1996) Gene. 169:187-190.
-   Linton M. F. et al. (1993) J. Clin. Invest. 92:3029-3037.
-   Liu Z. et al. (1994) Proc. Natl. Acad. Sci. USA. 91: 4528-4262.
-   Livak et al., Nature Genetics, 9:341-342, 1995
-   Livak K J, Hainer J W, Hum Mutat 1994; 3(4):379-385
-   Lockhart et al. Nature Biotechnology 14: 1675-1680, 1996
-   Lucas A. H., 1994, In: Development and Clinical Uses of Haemophilus b Conjugate;
-   Mansour S. L. et al. (1988) Nature. 336:348-352.
-   Marshall R. L. et al. (1994) PCR Methods and Applications. 4:80-84.
-   McCormick et al. (1994) Genet. Anal. Tech. Appl. 11:158-164.
-   McLaughlin B. A. et al. (1996) Am. J. Hum. Genet. 59:561-569.
-   Morton N. E., Am J. Hum. Genet., 7:277-318, 1955
-   Muzyczka et al. (1992) Curr. Topics in Micro. and Immunol. 158:97-129.
-   Nada S. et al. (1993) Cell 73:1125-1135.
-   Nagy A. et al., 1993, Proc. Natl. Acad. Sci. USA, 90: 8424-8428.
-   Narang S A, Hsiung H M, Brousseau R, Methods Enzymol 1979; 68:90-98
-   Neda et al. (1991) J. Biol. Chem. 266:14143-14146.
-   Newton et al. (1989) Nucleic Acids Res. 17:2503-2516.
-   Nickerson D. A. et al. (1990) Proc. Natl. Acad. Sci. U.S.A. 87:8923-8927.
-   Nicolau C. et al., 1987, Methods Enzymol., 149:157-76.
-   Nicolau et al. (1982) Biochim. Biophys. Acta. 721:185-190.
-   Nyren P, Pettersson B, Uhlen M, Anal Biochem 1993; 208(1):171-175
-   O'Reilly et al. (1992) Baculovirus Expression Vectors. A Laboratory Manual. W. H. Freeman and Co., New York.
-   Ohno et al. (1994) Science. 265:781-784.
-   Oldenburg K. R. et al., 1992, Proc. Natl. Acad. Sci., 89:5393-5397.
-   Orita et al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86: 2776-2770.
-   Ott J., Analysis of Human Genetic Linkage, Johns Hopkins University Press, Baltimore, 1991
-   Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology D. Wier (ed) Blackwell (1973)
-   Parmley and Smith, Gene, 1988, 73:305-318
-   Pastinen et al., Genome Research 1997; 7:606-614
-   Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85(8):2444-2448
-   Pease S. and William R. S., 1990, Exp. Cell. Res., 190: 209-211.
-   Perlin et al. (1994) Am. J. Hum. Genet. 55:777-787.
-   Peterson et al., 1993, Proc. Natl. Acad. Sci. USA, 90: 7593-7597.
-   Pietu et al. Genome Research 6:492-503, 1996
-   Potter et al. (1984) Proc. Natl. Acad. Sci. U.S.A. 81(22):7161-7165.
-   Ramunsen et al., 1997, Electrophoresis, 18: 588-598.
-   Reid L. H. et al. (1990) Proc. Natl. Acad. Sci. U.S.A. 87:4299-4303.
-   Risch, N. and Merikangas, K., Science, 273:1516-1517, 1996
-   Robertson E., 1987, Embryo-derived stem cell lines. In: E. J. Robertson Ed. Teratocarcinomas and embryonic stem cells. a practical approach. IRL Press, Oxford, pp. 71.
-   Rossi et al., Pharmacol. Ther. 50:245-254, (1991)
-   Roth J. A. et al. (1996) Nature Medicine. 2(9):985-991.
-   Roux et al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86:9079-9083.
-   Ruano et al. (1990) Proc. Natl. Acad. Sci. U.S.A. 87:6296-6300.
-   Sambrook, J., Fritsch, E. F., and T. Maniatis. (1989) Molecular Cloning. A Laboratory Manual. 2ed. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
-   Samson M, et al. (1996) Nature, 382(6593):722-725.
-   Samulski et al. (1989) J. Virol. 63:3822-3828.
-   Sanchez-Pescador R. (1988) J. Clin. Microbiol. 26(10):1934-1938.
-   Sarkar, G. and Sommer S. S. (1991) Biotechniques.
-   Sauer B. et al. (1988) Proc. Natl. Acad. Sci. U.S.A. 85:5166-5170.
-   Schaid D. J. et al., Genet. Epidemiol., 13:423-450, 1996
-   Schedl A. et al., 1993a, Nature, 362: 258-261.
-   Schedl et al., 1993b, Nucleic Acids Res., 21: 4783-4787.
-   Schena et al. Science 270:467-470, 1995
-   Schena et al., 1996, Proc Natl Acad Sci USA, 93(20):10614-10619.
-   Schneider et al. (1997) Arlequin: A Software For Population Genetics Data Analysis. University of Geneva.
-   Schwartz and Dayhoff, eds., 1978, Matrices for Detecting Distance Relationships: Atlas of Protein Sequence and Structure, Washington: National Biomedical Research Foundation
-   Sczakiel G. et al. (1995) Trends Microbiol. 3(6):213-217.
-   Shay J. W. et al., 1991, Biochem. Biophys. Acta, 1072: 1-7.
-   Sheffield, V. C. et al. (1991) Proc. Natl. Acad. Sci. U.S.A. 49:699-706.
-   Shizuya et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:8794-8797.
-   Shoemaker D D, et al., Nat Genet. 1996; 14(4):450-456
-   Smith (1957) Ann. Hum. Genet. 21:254-276.
-   Smith et al. (1983) Mol. Cell. Biol. 3:2156-2165.
-   Sosnowski R G, et al., Proc Natl Acad Sci USA 1997; 94:1119-1123
-   Sowdhamini et al., Protein Engineering 10:207, 215 (1997)
-   Spielmann S, and Ewens W. J., Am. J. Hum. Genet., 62:450-458, 1998
-   Spielmann S. et al., Am. J. Hum. Genet., 52:506-516, 1993
-   Sternberg N. L. (1992) Trends Genet. 8:1-16.
-   Sternberg N. L. (1994) Mamm. Genome. 5:397-404.
-   Stryer, L., Biochemistry, 4th edition, 1995
-   Syvanen A C, Clin Chim Acta 1994; 226(2):225-236
-   Szabo A. et al. Curr Opin Struct Biol 5, 699-705 (1995)
-   Tacson et al. (1996) Nature Medicine. 2(8):888-892.
-   Te Riele et al. (1990) Nature. 348:649-651.
-   Terwilliger J. D. and Ott J., Handbook of Human Genetic Linkage, Johns Hopkins University Press, London, 1994
-   Thomas K. R. et al. (1986) Cell. 44:419-428.
-   Thomas K. R. et al. (1987) Cell. 51:503-512.
-   Thompson et al., 1994, Nucleic Acids Res. 22(2):4673-4680
-   Tur-Kaspa et al. (1986) Mol. Cell. Biol. 6:716-718.
-   Tyagi et al. (1998) Nature Biotechnology. 16:49-53.
-   Urdea M. S. (1988) Nucleic Acids Research. 11:4937-4957.
-   Urdea M. S. et al. (1991) Nucleic Acids Symp. Ser. 24:197-200.
-   Vaitukaitis, J. et al. J. Clin. Endocrinol. Metab. 33:988-991 (1971)
-   Valadon P., et al., 1996, J. Mol. Biol., 261:11-22.
-   Van der Lugt et al. (1991) Gene. 105:263-267.
-   Vlasak R. et al. (1983) Eur. J. Biochem. 135:123-126.
-   Wabiko et al. (1986) DNA 5(4):305-314.
-   Walker et al. (1996) Clin. Chem. 42:9-13.
-   Wang et al., 1997, Chromatographia, 44: 205-208.
-   Weir, B. S. (1996) Genetic data Analysis II. Methods for Discrete population genetic Data, Sinauer Assoc., Inc., Sunderland, Mass., U.S.A.
-   Westerink M. A. J., 1995, Proc. Natl. Acad. Sci., 92:4021-4025
-   White, M. B. et al. (1992) Genomics. 12:301-306.
-   White, M. B. et al. (1997) Genomics. 12:301-306.
-   Wong et al. (1980) Gene. 10:87-94.
-   Wood S. A. et al., 1993, Proc. Natl. Acad. Sci. USA, 90: 4582-4585.
-   Wu and Wu (1987) J. Biol. Chem. 262:4429-4432.
-   Wu and Wu (1988) Biochemistry. 27:887-892.
-   Wu et al. (1989) Proc. Natl. Acad. Sci. U.S.A. 86:2757.
-   Yagi T. et al. (1990) Proc. Natl. Acad. Sci. U.S.A. 87:9918-9922.
-   Zhao et al., Am. J. Hum. Genet., 63:225-240, 1998
-   Zou Y. R. et al. (1994) Curr. Biol. 4:1099-1103.

1. An isolated, purified, or recombinant polynucleotide comprising: a) SEQ ID NOs:2-9 or 14-19 or 21, or a sequence complementary to any of these sequences; b) a nucleic acid sequence having at least about 95% identity to SEQ ID NOs: 14-19 or 21 and encoding a functional glycine transporter; or c) a polynucleotide which encodes a polypeptide comprising SEQ ID NOs:26-31 or 33.

2. The polynucleotide of claim 1, wherein said polynucleotide is attached to a solid support.

3. The polynucleotide of claim 1, further comprising a label.

4. The polynucleotide of claim 1, wherein said polynucleotide is operably linked to a promoter.

5. An array of polynucleotides comprising the polynucleotide of claim 1.

6. The array of claim 5, wherein said array is addressable.

7. A recombinant vector comprising the polynucleotide of claim 1.

8. A host cell comprising the recombinant vector of claim 7.

9. A non-human host animal or mammal comprising the recombinant vector of claim 7.

10. An isolated, purified, or recombinant polypeptide comprising: a) SEQ ID NOs:26-33; or b) the polypeptide encoded by any of the nucleic acid sequences shown as SEQ ID NOs:2-9 or 14-21.

11. A method of producing a GlyT1 polypeptide, said method comprising the following steps: a) providing a host cell comprising a nucleic acid according to claim 1 operably linked to a promoter; b) cultivating said host cell under conditions conducive to the expression of said polypeptide; and c) isolating said polypeptide from said host cell.

12. An isolated or purified antibody capable of selectively binding to an epitope-containing fragment of a polypeptide according to claim 10.

13. A method of binding an anti-GlyT1 antibody to a polypeptide comprising contacting said antibody with said polypeptide according to claim 10 under conditions in which said antibody can specifically bind to said polypeptide.

14. A diagnostic kit comprising a polynucleotide according to claim 1.

15. A method of detecting the expression of a GlyT1 gene within a cell, said method comprising the steps of: a) contacting said cell or an extract from said cell with a polynucleotide that hybridizes under stringent conditions to a polynucleotide encoding GlyT1 or a compound that specifically binds to GlyT1; and b) detecting the presence or absence of hybridization between said polynucleotide and an RNA species within said cell or extract, or the presence or absence of binding of said compound to a protein within said cell or extract; wherein a detection of the presence of said hybridization or of said binding indicates that said GlyT1 gene is expressed within said cell.

16. The method of claim 15, wherein said polynucleotide is an oligonucleotide primer, and wherein said hybridization is detected by detecting the presence of an amplification product comprising the sequence of said primer.

17. The method of claim 15, wherein said compound is an anti-GlyT1 antibody.

18. A method of identifying a candidate modulator of a GlyT1 polypeptide, said method comprising: a) contacting the polypeptide of claim 10 with a test compound; and b) determining whether said compound specifically binds to said polypeptide; wherein a detection that said compound specifically binds to said polypeptide indicates that said compound is a candidate modulator of said GlyT1 polypeptide.

19. The method of claim 18, further comprising testing the activity of said GlyT1 polypeptide in the presence of said candidate modulator, wherein a difference in the activity of said GlyT1 polypeptide in the presence of said candidate modulator in comparison to the activity in the absence of said candidate modulator indicates that the candidate modulator is a modulator of said GlyT1 polypeptide.

20. A method of identifying a modulator of a GlyT1 polypeptide, said method comprising: a) contacting the polypeptide of claim 10 with a test compound; and b) detecting the activity of said polypeptide in the presence and absence of said compound; wherein a detection of a difference in said activity in the presence of said compound in comparison to the activity in the absence of said compound indicates that said compound is a modulator of said GlyT1 polypeptide.

21. The method of claim 19, wherein said polypeptide is present in a cell or cell membrane, and wherein said activity comprises glycine transport activity.

22. The method of claim 20, wherein said polypeptide is present in a cell or cell membrane, and wherein said activity comprises glycine transport activity.

23. A method for pharmaceutical composition comprising: a) identifying a modulator of a GlyT1 polypeptide using the method of claim 20; and b) combining said modulator with a physiologically acceptable carrier.